0.0 INTRODUCTION

A data flow computer is a machine with architecture radically different from
that of existing computers. It can perform computations simultaneously on many different
parts of a program. A typical data flow computer has many arithmetic processors, and can
utilize all of them simultaneously, each executing a different instruction.

To handle arrays and other data structures, a data flow computer must have a
data structure processing facility and memory that has a similar facility to perform many
operations concurrently. Such a data structure memory is the subject of this thesis.

A data flow computer owes its great speed to its ability to perform many
operations at once, even though each individual operation is no faster than on a conventional
computer. The same is true of the memory. The memory to be presented here has a
retrieval delay just as great as conventional memories, since no new circuit technology will be
proposed. However, it has an enormous data transfer rate because of its ability to handle
concurrent transactions. This concurrency is made possible by an unusual type of interface
called packet communication.

Section 1 of this thesis is an overview of data flow computers and the type of
memory that such a computer requires for structure processing. Section 2 is a treatment of
packet communication systems, showing how their behavior is defined. In section 3 the basic
memory unit is described, along with a "cache" mechanism and an "interleaving" method to
improve its performance. In section 4 an implementation of the memory using shift registers
or magnetic disks will be given, showing how the disadvantages of such devices can be
overcome through the use of packet communication. Section 5 examines some aspects of the
processing unit that uses the memory, and section 6 examines the "deadlock" problem and the
cost of overcoming it. Section 7 presents suggestions for future research.



1.0 DATA FLOW COMPUTERS 

As the need increases for ever faster computers, one technique for improving
performance that has drawn considerable interest in recent years is a radically new
design known as data flow architecture. A conventional computer has only
one locus of control, that is, one point in the program at any given instant at which
instructions are executed. Ability to execute more than one instruction at a time can improve
performance significantly, and some computers use an instruction lookahead to achieve this [3]
[9]. However, the benefits of lookahead methods are limited, and such computers are
enormously complex. Other attempts to increase instruction concurrency include "array
processors", but such computers are inflexible and extremely difficult to program.



A data flow computer achieves its concurrency by using a different
internal representation of the source program. Instead of representing the program as a list
of instructions to be executed in a particular order, the program is represented as a data flow
schema. A data flow schema is a directed graph whose nodes represent instructions and
whose arcs show the data dependence among instructions. The order of instruction execution
is determined solely by the data dependence: an instruction is executed when all of its data
sources have produced results and all of its destinations are ready to receive data. This
allows many instructions throughout the program to be executed simultaneously.

The data in a data flow program can be represented by "tokens" that reside on the
arcs of the graph. Each arc may contain at most one token. The execution rule for most
instructions is as follows:

An instruction (other than a merge or gate) is ready for execution whenever all
of its input arcs contain tokens and all of its output arcs are empty. When an
instruction is executed, the tokens on the input arcs are absorbed. The
function denoted by the instruction is computed, using the values in the
absorbed tokens as input data. A token containing the function value is placed
on each output arc.
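The execution rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Arc` and `Instruction` are hypothetical, an empty arc is modelled as `None` (so a token cannot itself carry the value `None`), and firing is done by an explicit call rather than by hardware.

```python
from typing import Callable, List, Optional


class Arc:
    """An arc of the schema; it holds at most one token (None means empty)."""

    def __init__(self) -> None:
        self.token: Optional[object] = None


class Instruction:
    """An ordinary instruction (not a merge or gate) with input and output arcs."""

    def __init__(self, fn: Callable[..., object],
                 inputs: List[Arc], outputs: List[Arc]) -> None:
        self.fn = fn
        self.inputs = inputs
        self.outputs = outputs

    def ready(self) -> bool:
        # Ready when every input arc holds a token and every output arc is empty.
        return (all(a.token is not None for a in self.inputs)
                and all(a.token is None for a in self.outputs))

    def fire(self) -> None:
        if not self.ready():
            raise RuntimeError("instruction is not ready for execution")
        args = [a.token for a in self.inputs]
        for a in self.inputs:          # absorb the input tokens
            a.token = None
        result = self.fn(*args)        # compute the function value
        for a in self.outputs:         # place a copy on each output arc
            a.token = result
```

With two input arcs holding 2 and 3 and an empty output arc, an addition instruction built this way is ready, and firing it leaves 5 on the output arc and both inputs empty. It is not ready again until the output token has been removed.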



There are a number of ways of handling decisions and iteration control.
Perhaps the simplest is the use of special instructions M, T, and F. These receive a boolean
value on one input (the "control" input) and use it to control the passage of data from another
input. Their execution rules are as follows:

The M (merge) has a control input and two data inputs labelled "T" and "F". To
be ready for execution, there must be a boolean token on the arc leading to its
control input. Furthermore, the arc leading to whichever of its T or F input
matches that boolean token must have a token, and all output arcs must be
empty. When it is executed, the control token and the data token at the input
indicated by the control token are absorbed. Copies of the token at the
selected data input are placed on each output arc. Input tokens are not
required at the non-selected data input, and if any are present they are not
absorbed.

The T (true gate) and F (false gate) instructions have a control input and a data 
input. They are ready for execution whenever both input arcs contain tokens
and all output arcs are empty. When they are executed, the inputs are 
absorbed. If the control input matches the name of the instruction, copies of 
the data input are placed on the output arcs. If not, no tokens are placed on 
the output arcs. 
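The merge and gate rules can be sketched as follows. This is a minimal illustration, not the thesis's mechanism: arcs are modelled as entries in a dictionary keyed by hypothetical arc names, `None` stands for an empty arc, and each function fires its instruction only if the instruction is ready.

```python
from typing import Dict, List, Optional

Arcs = Dict[str, Optional[object]]   # arc name -> token, or None when empty


def outputs_empty(arcs: Arcs, outs: List[str]) -> bool:
    return all(arcs[o] is None for o in outs)


def fire_merge(arcs: Arcs, control: str, t_in: str, f_in: str,
               outs: List[str]) -> bool:
    """Fire an M instruction if it is ready; return True if it fired."""
    c = arcs[control]
    if c is None or not outputs_empty(arcs, outs):
        return False
    selected = t_in if c else f_in      # the data input named by the control token
    if arcs[selected] is None:
        return False
    value = arcs[selected]
    arcs[control] = None                # absorb the control token
    arcs[selected] = None               # absorb only the selected data token
    for o in outs:
        arcs[o] = value
    return True


def fire_gate(arcs: Arcs, kind: bool, control: str, data: str,
              outs: List[str]) -> bool:
    """Fire a T gate (kind=True) or an F gate (kind=False) if it is ready."""
    if (arcs[control] is None or arcs[data] is None
            or not outputs_empty(arcs, outs)):
        return False
    c, value = arcs[control], arcs[data]
    arcs[control] = None                # both inputs are always absorbed
    arcs[data] = None
    if c == kind:                       # pass the data only on a match
        for o in outs:
            arcs[o] = value
    return True
```

Note the asymmetry the rules describe: a gate always absorbs its data token, whereas a merge leaves any token on its non-selected input in place.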

Constants can be generated through the use of functions of no arguments. An 
instruction to perform such a function has no input arcs, so, in accordance with the execution 
rule, it places tokens on its output arc as fast as they are removed.



Here is an example of a data flow schema to compute the factorial function:

[Figure: data flow schema for the factorial function]

Boolean inputs to M, T, and F instructions are drawn as open arrows. Tokens existing in the
initial configuration of the program are drawn as filled-in circles.

The behavior of a data flow schema under the execution rules has a very
important property: it is determinate. This means that the output of the program is
determined only by the input, and is independent of the timing of instruction executions. All
runs of such a program with the same data will yield the same results. Determinacy follows
from the facts that

(1) Each instruction produces a result which is a function only of the values of
its input tokens, that is, each node of the schema is determinate.

(2) The value of a token does not change in any way while it resides on an arc.

(3) The execution rules, and fact (2) above, qualify the schema as a valid
interconnection of autonomous communicating systems.

It is an established result that such an interconnection of determinate systems is determinate.

1.0.1 DATA FLOW COMPUTER ARCHITECTURE 

The memory system and structure processor that are the subject of this thesis
are intended to be part of a computer of the type described by Dennis and Misunas [6] [7].
Such a computer is composed of units which use packet communication [8] for transfer of
data. The only means of data transmission among these units is the transmission of fixed size
messages called packets. There is no clock or synchronizing information.

The four main parts of the data flow computer are the instruction memory,
arbitration network, functional units, and distribution network. For structure processing, the
structure controller and structure memory are added.



[Figure: block diagram of the data flow computer, showing the instruction memory, the
arbitration network, and the distribution network]

To execute a data flow program, its schema is encoded into the instruction
memory. Each cell of the memory contains one instruction of the schema. At the time the
program is loaded, each cell is filled with the operation code (arithmetic operation, merge,
structure operation, etc.) and the address of its destinations. The latter are the cells to
which outgoing arcs point. The instruction cells also have receiver registers to contain
incoming "tokens". When all necessary receiver registers become full, an instruction cell emits
an operation packet consisting of its operation code, the data from the receiver registers, and
the destinations.



Any given program has a great number of instruction cells, each sending
operation packets only occasionally. These streams of packets are merged by the arbitration
network into a small number of dense streams. The packets coming out of the arbitration
network are sorted according to operation code and sent to the appropriate functional units.
In the case of structure processing instructions, they are sent to the structure controller.
The functional units or structure controller perform the indicated operation and form, for each
destination, a result packet consisting of the destination address and a copy of the actual
result. The result packets go to the distribution network, where they are sorted by address
and sent to the appropriate receiver register of the appropriate instruction cell. (The
destination address includes the receiver number.) If the instruction is a structure operation,
the structure controller may send numerous command packets to the memory and receive
result packets back during the course of its computation.

The preceding description does not quite implement the execution rule: an
instruction cell should wait until its "output arcs", that is, the receivers of its destinations, are
empty before issuing an operation packet. There is no way for an instruction cell to "see" its
destinations' receivers. The problem is remedied by using, where necessary, acknowledgment
tokens sent from a cell's destinations to the cell itself. The acknowledges are treated like
invisible arguments, except that they contain no data. When a cell is executed, it may send
result packets to some destinations and acknowledges to others. A cell is not ready to be
executed until it has received all necessary real arguments and all necessary acknowledges.
Acknowledges are placed in the program where necessary to ensure that, when a cell has
received all arguments and acknowledges, its destinations' receiver registers will be empty.
These acknowledges should not be confused with the packet acknowledges to be developed
later.
A constant need not be implemented as a separate node of the data flow
schema. It can simply be loaded into the receiver register of the instruction cell that uses it,
and marked in such a way that the instruction cell knows that that register is always full.

An additional part of the data flow computer, not shown in the preceding
diagram, is the host computer. This is a computer of conventional design, which has access to
the memory units and control functions of the data flow computer. It is used for diagnostic
testing and for initial loading of the instruction memory and structure memory. It does not
participate in the actual data flow computation.
1.1 DATA STRUCTURES



To handle arrays and other data structures in a data flow computer, it is in
most cases necessary to allow single tokens to have entire structures as their values. (Some
programs which use arrays of fixed size, such as Fourier transforms and other signal
processing algorithms, can make do with arrays whose elements are carried as one token on
each arc. However, this approach is impractical for very large arrays or for dynamic
structures.) For this reason, we propose a data structure facility that allows tokens to have
structure values. The simplest type of structure that permits full generality is the binary tree,
which is recursively defined: a binary tree is an elementary object from some set, or is a
concatenation of two binary trees. Such trees form the basis for the list processing language
LISP [4] [13]. For definiteness, the structures used in a data flow computer will be assumed
to be binary trees.

The "elementary objects" are all data values other than structures that the
computer can handle, plus the special object nil. Elementary objects thus might include
integers, boolean values, reals, etc.

The principal operation on a data structure is selection. A simple selection
takes a structure and a single bit. If the structure is elementary and not nil, the result of the
selection is undefined. If the structure is nil, the result is nil. Otherwise, the structure is the
concatenation of two structures, and the result of the selection is the first or second of these
if the bit is zero or one respectively. A compound selection takes a structure and a string of
bits, and is the result of applying simple selections repeatedly, using the bits in order.
The bit string is called the selector. Let S be the following structure:
SELECT[S, '1'] = 5 (a simple selection)

SELECT[S, '001'] = SELECT[SELECT[SELECT[S, '0'], '0'], '1'] = 4 (a compound selection)

The true "meaning" or "value" of a structure can be defined to be the set of
ordered pairs of selectors that yield elementary values other than nil, along with those values.
Thus the structure S denotes the set

{ <'000', 1>, <'001', 4>, <'011', 3.14>, <'1', 5> }

Nil simply denotes a substructure with no elementary items at all.

Using this definition of the meaning of a structure, there is a structure
corresponding to any finite set of ordered pairs of selectors and elementary values (excluding
nil) such that no selector in the set is an initial substring of another. The structure nil
denotes the empty set.

SELECT[struc, sel] =

The elementary value v if struc contains the pair <sel, v>
Undefined if <s, v> ∈ struc where s is a proper initial substring of sel
The structure { <s, v> | <sel·s, v> ∈ struc } otherwise (sel·s denotes sel followed by s)
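The set-theoretic definition of SELECT can be tried out directly. The sketch below is an illustration only, assuming a structure is modelled as a Python dictionary from selector strings to elementary values (so nil is the empty dictionary); the names `select` and `Undefined` are hypothetical.

```python
from typing import Dict

Structure = Dict[str, object]    # selector bit string -> elementary value


class Undefined(Exception):
    """Raised when a selection passes through a non-nil elementary value."""


def select(struc: Structure, sel: str) -> object:
    if sel in struc:
        return struc[sel]                  # the pair <sel, v> is present
    for s in struc:
        if sel.startswith(s):              # s is a proper initial substring of sel
            raise Undefined(f"selector {sel!r} runs past the value at {s!r}")
    # Otherwise: the substructure of pairs whose selectors begin with sel.
    return {s[len(sel):]: v for s, v in struc.items() if s.startswith(sel)}


# The example structure S given in the text, as a set of <selector, value> pairs.
S: Structure = {"000": 1, "001": 4, "011": 3.14, "1": 5}
```

With this model, `select(S, "1")` gives 5, `select(S, "001")` gives 4, the same result is obtained by three successive one-bit selections, and `select(S, "010")` gives the empty dictionary, that is, nil.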
Structures can be built with the second operation, APPEND. APPEND places a given object
(structure or elementary value) into a given structure with a given selector, removing whatever
substructure previously existed there. In the set-theoretic notation,

APPEND[struc, new-val, sel] =

(struc - { <s, v> | one of the selectors s and sel is an initial substring of the other })
U { <sel, new-val> }   if new-val is an elementary value other than nil

(struc - { <s, v> | one of the selectors s and sel is an initial substring of the other })
U { <sel·s, v> | <s, v> ∈ new-val }   if new-val is a structure

where sel·s denotes sel followed by s.
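As a rough check on the set-theoretic reading of APPEND, the sketch below models a structure as a Python dictionary from selector strings to elementary values, with nil as the empty dictionary. The representation and the function name `append` are assumptions made for illustration; a new dictionary is returned so that the original structure keeps its meaning.

```python
from typing import Dict

Structure = Dict[str, object]    # selector bit string -> elementary value


def append(struc: Structure, new_val: object, sel: str) -> Structure:
    # Discard every pair whose selector is an initial substring of sel,
    # or of which sel is an initial substring.
    kept = {s: v for s, v in struc.items()
            if not (s.startswith(sel) or sel.startswith(s))}
    if isinstance(new_val, dict):
        # new_val is a structure (possibly nil): graft its pairs under sel.
        kept.update({sel + s: v for s, v in new_val.items()})
    else:
        # new_val is an elementary value other than nil.
        kept[sel] = new_val
    return kept


S: Structure = {"000": 1, "001": 4, "011": 3.14, "1": 5}
```

Appending nil (the empty dictionary) at a selector simply deletes whatever was there, and appending to nil creates a new structure, which matches the uses of nil described for the structure controller.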



Letting S be the structure defined previously, the result of an APPEND into S is

[Figure: resulting structure]

The substructure containing nil and 3.14 has been removed.



Structures can be implemented in a data flow computer in the same way that
they are commonly implemented on ordinary computers: as linked lists of "cells" in a memory.
An elementary object is represented by the object itself. A concatenation is represented by
the address in memory of a cell containing the representations of the two substructures. In
either case, a structure is represented by a small amount of information. The large amount of
information that constitutes the structure itself is kept in the memory, and the representation
is merely a pointer to this. The operation of selection is quite simple. Cells are read from
memory and the appropriate halves of the data used, under control of the selection bits.

1.1.2 SHARING 

Such an implementation leads to the possibility of a single structure in memory
being shared (or partly shared) by several parts of the computation. In a data flow computer,
two tokens might have the same pointer as their value. This is of course very desirable for
economical memory use, but it makes the APPEND operation difficult. The problem is that
modification of pointers inside the memory can change the value of structures other than the
intended one, if structures have parts in common. In many programming languages, this is
considered a reasonable and even desirable effect. For example, the LISP language has
instructions to modify existing structures. In a data flow computer, however, this cannot be
permitted for reasons of determinacy. In order for a data flow computer to be determinate,
the meaning (in the set-theoretic sense given previously) of a token bearing a structure value
must not change while that token resides on an arc. Since other instructions, including
APPENDs, can be executed while a token resides on an arc, APPEND must never change any
substructures that are shared with other structures.

In the proposed structure processing facility, each cell has a reference count 
which makes it easy to tell what substructures are shared. Whenever the APPEND processor
is tempted to modify a cell that is shared with another structure, it makes a copy of the cell 
and modifies the copy instead. For example, if S is a pointer to the following structure in 
memory: 




where the number in each node is the reference count, appending the value 7 to S yields

[Figure: resulting structure, with reference counts]



The node that originally had a reference count of two may not be modified, so a copy is made, 
and its reference count is therefore reduced to one. The structure controller to be described 
in the next section will perform these tasks. 
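The copying rule can be illustrated with a small model. Everything here is an assumption made for the sketch: cells live in a Python dictionary, `Ptr` marks a half that points to another cell, `None` stands for nil, and `append` consumes the caller's pointer and returns the pointer to the result. Releasing whatever the new value replaces is left out to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Ptr:
    addr: int                      # address of a cell in the memory


@dataclass
class Cell:
    halves: List[object]           # [left, right]: a Ptr, an elementary value, or None (nil)
    refs: int = 1                  # number of pointers to this cell


class Memory:
    def __init__(self) -> None:
        self.cells: Dict[int, Cell] = {}
        self.next_addr = 0

    def new(self, left: object, right: object) -> Ptr:
        p = Ptr(self.next_addr)
        self.next_addr += 1
        self.cells[p.addr] = Cell([left, right])
        return p


def writable(mem: Memory, p: Ptr) -> Ptr:
    """Return p if its cell is unshared, otherwise a private copy of the cell."""
    cell = mem.cells[p.addr]
    if cell.refs == 1:
        return p
    cell.refs -= 1                               # give up our pointer to the shared cell
    q = mem.new(cell.halves[0], cell.halves[1])
    for h in cell.halves:                        # the copy now also points to the children
        if isinstance(h, Ptr):
            mem.cells[h.addr].refs += 1
    return q


def append(mem: Memory, p: Ptr, value: object, sel: str) -> Ptr:
    """Place value at the non-empty selector sel, copying shared cells on the way."""
    p = writable(mem, p)
    cell = mem.cells[p.addr]
    bit = int(sel[0])
    if len(sel) == 1:
        cell.halves[bit] = value                 # (release of the old half omitted)
    else:
        child = cell.halves[bit]
        if not isinstance(child, Ptr):           # nil or elementary: grow a fresh node
            child = mem.new(None, None)
        cell.halves[bit] = append(mem, child, value, sel[1:])
    return p
```

When a cell with reference count two is reached, the sketch lowers its count to one and modifies a copy, so any other structure that shares the cell still sees the original contents.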



1.2 THE STRUCTURE CONTROLLER 

In this section we will outline the behavior of a processing mechanism that uses
the structure memory to provide a structure facility for the data flow computer. The basic
behavior of the structure controller is that it receives operation packets from the arbitration
network and delivers result packets to the distribution network. It holds the state information
for structure operations in progress, and performs memory operations by sending packets to
the memory and receiving packets in return.

The purpose of this section is to show how the structure controller will use the
memory, rather than to give a detailed specification for the structure controller. Therefore, a
number of design decisions will be made arbitrarily. For the most part, the requirements of
the structure memory are independent of these decisions. For example, the memory design 
would not change if ternary trees were used instead of binary ones. 

Some aspects of the design of the structure controller will be considered in 
more detail in section 5. 

1.2.1 DATA FORMAT

The memory space is divided into "words" or "cells", each of which holds one
node of a structure. Since the memory is used for the storage of binary trees, the words
representing nonterminal nodes contain two pointers to other nodes. The convention will be
made that all words of the memory will be divided into halves, called the left half and the right
half. Each half has an "elem" bit that indicates whether it contains an elementary item (terminal
node) or a pointer to another word in the memory. If the bit is 1, the half word contains an
elementary value. The interpretation of that half word is then the exclusive responsibility of
the rest of the computer, unless it is nil. The structure controller treats any elementary value
other than nil simply as a collection of bits. Any type information (integer, floating point
number, character, etc.) must be encoded into the half word along with the data.

The structure graphically represented as follows:

[Figure: example structure]

might be realized in memory by the following words:

[Figure: memory words realizing the structure]
The bit at the left end of each half word is the "elem" bit.

A different convention could be used, in which each elementary value takes an
entire word instead of half a word. The two conventions are equally powerful, and differ only
slightly in efficiency. The "half word" convention will be used for definiteness.



1.2.2



All words of memory that are not part of a structure are kept in a collection of
free storage lists. (There are several such lists, rather than one, in order to maintain a high
rate of operation. This will be discussed in section 5.) Whenever the structure
controller needs a word in order to create a node, it takes it from one of the lists. Whenever
a node is destroyed, that is, all pointers to it disappear, the word containing it is returned to a
free storage list.
Each node of a structure has a reference count, which is the number of
pointers to that node that exist, whether in other nodes or in the rest of the computer. (The
latter includes operands waiting in instruction cells and packets in transit through the
arbitration and distribution networks.) The structure controller increases or decreases the
reference count of each node as pointers to it are created and destroyed. When the
reference count is decreased to zero, the node disappears, and is returned to a free storage
list. Whenever this happens, any pointers that the node contained disappear, and so the
reference counts of the nodes pointed to must be decreased.
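The cascading decrement can be sketched as a small worklist loop. The layout is an assumption made for illustration: each word is a dictionary with `left`, `right`, and `refs` entries, a half is a tuple `("ptr", address)` or `("elem", value)`, and a single free list is used even though the text calls for several.

```python
from typing import Dict, List


def release(memory: Dict[int, dict], free_list: List[int], addr: int) -> None:
    """Destroy one pointer to the word at addr, freeing words whose count reaches zero."""
    pending = [addr]
    while pending:
        a = pending.pop()
        word = memory[a]
        word["refs"] -= 1
        if word["refs"] == 0:
            # The node disappears, so the pointers it contained disappear too.
            for half in (word["left"], word["right"]):
                if half[0] == "ptr":
                    pending.append(half[1])
            free_list.append(a)          # return the word to free storage
```

A shared node survives the destruction of one of its parents and is freed only when the last pointer to it goes away.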

The choice of a reference count strategy for memory management instead of
the "mark and scan" method commonly used in LISP systems was made for three reasons:

(1) The mark and scan method requires a garbage collection operation which
must find every reference to every structure. Since references exist in
packets in transit, it would be necessary to stop the entire computation and
wait until all packets stop moving before a garbage collection commences.

(2) The reference count is needed anyway in order to implement the copying
rule efficiently. Whenever the structure controller needs to modify a node
as part of an APPEND operation, it may do so safely if the reference count
is one. If not, the node must be copied.

(3) The objection to the reference count method in many list processing
systems, that it is difficult to recover circular lists, does not apply here.
Because of the copy rule, circular lists are never created.

1.2.3 THE STRUCTURE OPERATIONS

The structure controller to be proposed implements the following program level
operations:

SELECT[structure, selector] - The selector is a bit string of arbitrary finite length. The
structure is traversed under control of the bits in the selector, starting with
the leftmost bit. A compound selection is carried out as a sequence of one-bit
selections. The selected part of the structure is returned.

APPEND[structure, object, selector] - Returns a structure similar to the given
one, but having the object at the place specified by the selector. Whatever
was at that place in the original structure does not appear in the result. The
object may be elementary or a structure. Any part of the original structure
that is shared with other parts of the computation is not modified. The
controller copies part or all of the original structure if necessary to be
sure that this is the case.

The structure controller also recognizes the special constant nil which, while
elementary, denotes the structure with no selectors. It is used as a terminal node of a
structure to indicate that there are no objects beyond that point. Any part of a structure
may be deleted simply by using an APPEND operation to replace it with nil, and a structure
may be created by appending something to nil. It is assumed that the constant nil is explicitly
available to the programmer for these purposes. The controller normalizes all structures,
replacing with nil any node both of whose substructures are nil.

There are two more operations performed implicitly by the controller. If any
operation returning a structure sends its result to more than one destination, the reference
count of the result must be appropriately increased. Also, if any operation discards a structure
value, the reference count must be decreased. It follows that the conditional operations such
as true and false gates must be executed by a structure controller if the objects being
switched are structures.
1.2.4 THE MEMORY OPERATIONS

The structure controller communicates with the memory by sending command
packets and receiving result packets. These packets are given names describing the
operation to be performed.

To read a word of memory, a FET ("fetch") packet is sent, giving the address.
The memory returns a LOAD packet with the data. Between the FET and the corresponding
LOAD, many other packets might be sent and received. This is a consequence of the
parallelism of the data flow computer: just as with the other functional units, the rate at
which structure operations are performed can be increased by allowing many operations to
be in progress simultaneously. This concurrency is made possible by the use of packet
communication at the memory interface. The FET packet that begins an operation and the
LOAD packet that ends it are distinct events and might be separated by a great number of
other packet transmissions and receptions. Each LOAD packet is identified with the FET
packet that caused it by means of the "tag", to be described later.

Each LOAD packet contains the address of the word and its reference count, as
well as the data. The address is probably not used by the structure controller, but is included
as part of the specification of the memory module because it is needed by the cache
mechanism to be described in section 3.2. The structure controller uses the reference count
in order to tell when a node may be written on without being copied (if count = 1) and when
a node should be destroyed (if count = 0).

To increase or decrease the reference count of a word, the FET+ or FET-
packets, respectively, are sent. These are similar to FET, except that the reference count is
first modified. The memory replies to them with LOAD+ or LOAD- packets which are similar to
LOAD packets. In some cases the structure controller does not use the data in a LOAD+ or
LOAD- packet, but it does not really cost anything for the memory to send it.

To write on a word of memory, the structure controller sends an UPD
("update") packet giving the address, data, and reference count. The reference count is
presumably one, but the specification of the memory module allows an arbitrary count to be
given. (In an actual implementation of a structure controller and memory, unnecessary fields
would be omitted where possible, so that the controller would not send a reference count in
an UPD packet.) The memory sends no reply to an UPD packet.
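The command set can be summarized in a small model of the memory module. This is a sequential sketch under stated assumptions, so it does not show the concurrency that is the point of the design: packets are modelled as Python dictionaries with hypothetical field names, the increment and decrement fetches are spelled `FET+` and `FET-` with replies `LOAD+` and `LOAD-`, and the CLR command is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Word:
    data: object = None
    refs: int = 0


class StructureMemory:
    """Sequential model of the packet interface; one reply per fetch packet."""

    def __init__(self) -> None:
        self.words: Dict[int, Word] = {}

    def handle(self, packet: dict) -> Optional[dict]:
        op, addr = packet["op"], packet["addr"]
        if op not in ("FET", "FET+", "FET-", "UPD"):
            raise ValueError(f"unknown command {op!r}")
        word = self.words.setdefault(addr, Word())
        if op == "UPD":
            # Write the data and reference count; an UPD gets no reply.
            word.data, word.refs = packet["data"], packet["refs"]
            return None
        if op == "FET+":
            word.refs += 1              # the count is modified before the reply
        elif op == "FET-":
            word.refs -= 1
        return {
            "op": "LOAD" + op[3:],      # LOAD, LOAD+ or LOAD-
            "addr": addr,
            "data": word.data,
            "refs": word.refs,
            "tag": packet["tag"],       # echoed unchanged to the controller
        }
```

A controller built on this interface would inspect `refs` in the reply: a count of one means the word may be updated in place, and a count of zero after a `FET-` means the node should be destroyed.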

There is another command that the memory recognizes. The CLR packet waits
until all pending operations on the same word are complete, and then returns a DONE packet.

1.2.5 THE TAG FIELD

Every FET, FET+, or FET- packet has a field called the "tag" field that
constitutes a reminder from the structure controller to itself, telling it what to do with the
result of the operation. The tag field of a command packet is returned unchanged in the
result packet.

Consider the case of a simple SELECT instruction. When the instruction cell
fires, an operation packet goes to the structure controller containing the
structure, the selector, and the addresses of the cells that are to receive the
result. There might typically be three such destination addresses, each about 20 bits long.
The structure controller might place them in the tag field of the "fetch" command to the
memory, and then use them when they come back in the result packet. In the case of more
complicated structure operations, such as APPENDs with compound selectors, there is a large
amount of state information that must be remembered through the many memory transactions
that make up the structure operation. In addition to the destination addresses, there is the
datum to be appended, the structure to be ultimately returned, the remaining selector bits,
and a few pointers. The total amount of such state information can be substantial.

There are two ways of handling this information. One method is to include all
of it in the tag field of commands to the memory, so the structure controller doesn't need to
store any information about the state of ongoing structure operations. When the result
packet comes back from the memory, the structure controller looks at the entire packet
including the tag field, decides what to do next, and produces a new packet to send back to
the memory. This method (the "memoryless structure controller" method) is elegant, but it
requires an extremely wide data path for all memory transactions and it gives rise to very
difficult problems of avoiding deadlocks.

A second method is to store all of the state information in the structure
controller. This requires that the controller have a memory with a capacity of 200 bits or
more for every structure operation that can be in progress at one time. In this case only the
address of the block of memory in which the state information is stored must be put in the tag
field. If 256 simultaneous structure operations are allowed, the tag field only needs to be 8
bits.

In either case, commands to the memory contain a tag field. The memory
echoes the tag back to the controller in the result packet.
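The second method amounts to a table of operation states indexed by the tag. The sketch below is a hypothetical illustration of that idea, not the controller design itself: the class name, its methods, and the contents of the stored state are all assumptions. An 8-bit tag gives 2^8 = 256 table slots, one per structure operation in progress.

```python
from typing import Dict, List


class OperationTable:
    """State of in-progress structure operations, addressed by a small tag."""

    def __init__(self, tag_bits: int = 8) -> None:
        self.capacity = 1 << tag_bits                 # 2**8 = 256 slots for an 8-bit tag
        self.free: List[int] = list(range(self.capacity))
        self.state: Dict[int, dict] = {}

    def start(self, state: dict) -> int:
        """Reserve a slot for a new operation and return its tag."""
        if not self.free:
            raise RuntimeError("too many structure operations in progress")
        tag = self.free.pop()
        self.state[tag] = state
        return tag

    def resume(self, tag: int) -> dict:
        """Look up the state when a result packet echoes the tag back."""
        return self.state[tag]

    def finish(self, tag: int) -> dict:
        """Release the slot once the operation has delivered its result."""
        state = self.state.pop(tag)
        self.free.append(tag)
        return state
```

Only the tag travels with each memory command, so the data path stays narrow, at the cost of limiting the number of operations in progress to the table size.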

1.2.6 THE DATA AND REFERENCE COUNT FIELDS

The contents of each memory word consist of a data field and a reference
count field. The data field is further divided into two pointer fields, leaf indicator bits,
perhaps a bit to indicate that the cell is on the free storage list, and perhaps type indicator
fields for elementary values. All of these are significant only to the structure controller, and
are irrelevant to the memory. The memory can simply consider the data field as a homogeneous
field. In practice, it might be about 40 to 50 bits long.

From the memory's standpoint, the reference count is simply part of the data
associated with each word. In some transient cases it might become negative in some parts of
the memory system, although the structure controller will never see a negative reference
count. In a typical realization, the reference count field might be up to about 15 bits long.

Incoming and outgoing packets that read or write a word of memory have data 
and reference count fields that correspond precisely to the fields in memory. 
There is a partial order on histories: X ≤ Y if X is an initial subsequence of Y.
For example:

(1 | 3 | 4) ≤ (1 | 3 | 4 | 7)

but (1 | 2 | 4) and (1 | 3 | 4) do not satisfy this relation in either order.

Since histories only grow longer as time progresses and symbols already in a
history never change, a history at a later instant is always greater than or equal to a history
at an earlier instant.

The length of port history X is denoted |X|. The individual packets of X are

X_1, X_2, ..., X_|X|.

There is no defined time order among packet arrivals on different ports, so it is
useless to represent them as a single sequence. Instead, a history array is used, which is a
collection of histories, one per port. The partial order on histories can be extended to arrays:
A ≥ B if each history of A is greater than or equal to the corresponding history of B. Like
histories, history arrays increase as time progresses.
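The partial order is easy to state in code. In this sketch a history is modelled as a Python tuple of packets and a history array as a dictionary from hypothetical port names to histories; the function names are assumptions made for illustration.

```python
from typing import Dict, Sequence


def history_le(x: Sequence, y: Sequence) -> bool:
    """True if history x is an initial subsequence of history y."""
    return len(x) <= len(y) and list(y[:len(x)]) == list(x)


def array_le(a: Dict[str, Sequence], b: Dict[str, Sequence]) -> bool:
    """True if every port history of a is an initial subsequence of b's."""
    return a.keys() == b.keys() and all(history_le(a[p], b[p]) for p in a)
```

For instance, `history_le((1, 3, 4), (1, 3, 4, 7))` holds, while `(1, 2, 4)` and `(1, 3, 4)` are unrelated in either direction, which is why the order is only partial.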

The description of how a system is expected to behave is quite simple. It is a
description, for every input history array, of what output history array the system will
eventually produce. "Eventually" means in finite time for finite histories. For infinite
histories, it means that, for any K, the first K packets will be produced in finite time. This is
because a system which is expected to have an infinite output history cannot ever transmit its
entire output in finite time.

A description of the dependence of output history arrays on input arrays is
called a functional specification. It is a description of how a system is expected to behave.
The major problems in the field of packet communication systems are proving that a system
built in a certain way obeys a certain functional specification, and proving that the
interconnection of systems known to obey certain functional specifications obeys some other
functional specification.

If, for any input array, the functional specification states that there is only one
possible output array, the system is determinate (such systems are sometimes called functional,
but that term will not be used here). For a determinate system, there is a function f mapping
input arrays to output arrays such that, if input history X is presented, output history f(X)
will be produced. If further input is then presented, the input history is Y with Y ≥ X, and
output history f(Y) will be produced. Since the system cannot retract any of its previous
output, f(Y) ≥ f(X). From this it is easy to see that f must be monotonic:

X ≤ Y  implies  f(X) ≤ f(Y)

If there is more than one legal response to a given input array, the system is
nondeterminate. In that case the specification is not a single-valued function of the input
array; instead, f(X) is the set of all legal output history arrays. Further details on defining
the specifications of nondeterminate systems will be given later.

It is possible for an interconnection of nondeterminate systems to be
determinate. For example, a data flow computer is determinate, even though its arbitration
network is not. An interconnection of determinate systems, however, is always determinate,
and its function can be computed from the functions of its components.

2.0.2 DESCRIPTIVE SPECIFICATIONS

Since one of the major problems is to demonstrate that a system built
in a certain way obeys certain functional specifications, it is necessary to describe in a
reasonably formal way how a system is built. A wiring diagram is one such formalism, but it is
far too low-level for this purpose. A higher level method is needed. When a system is
assembled from components, all using the packet communication principle, it is of course easy
to describe the interconnection, telling which ports of the various systems are connected to
each other. Systems that cannot be so decomposed will be given in terms of a program,
written in an extremely informal ALGOL-like language. This language is a subset of the
Architecture Description Language [10] which is under development.

In the language we will use for giving descriptive specifications, packets will
look like data records with a title and one or more data fields, for example: "WRITE(3, 7)".
This format is purely cosmetic. In the actual hardware implementation, a packet is nothing but
a collection of bits. The fields are simply divisions of these bits into subsets that the sender
and receiver both agree upon. The titles are just encodings of another field.

2.0.3 AN EXAMPLE OF A DETERMINATE MEMORY

A functional and descriptive specification of a system called MEM will now be
given. MEM is a random access memory with an input port IN and an output port OUT. Two
types of packets may be delivered to it:

WRITE(addr, data) writes the data into the given address
READ(addr) fetches the data from the given address

The "addr" and "data" fields contain numbers that range over some finite and fixed spaces.
There is one output packet type:

RTR(addr, data)

(RTR stands for "retrieve")

Every READ packet delivered to MEM results in transmission of a RTR packet
bearing the address and the current contents of the memory. Every WRITE packet stores its
data in the memory and returns no result packet. The initial contents of each address of the
memory is zero.

For a given input history, the contents of the memory may be easily
determined. The contents of each word is simply the data field of the last WRITE packet
having that address, or zero if there is no such packet. The function f_MEM realized by this
memory is:

    The K-th packet of f_MEM(X) is RTR(addr, data), where
        the K-th READ(--) in X is READ(addr), and
        data = the data field of the last WRITE(addr,--) preceding that READ in X,
               if there is such a WRITE
             = zero if there is no such WRITE

    Notation: WRITE(addr,--) means any WRITE packet whose address field is addr,
    whatever its data field.

    The length of f_MEM(X) is the number of READ packets in X.

If a system MEM realizes f_MEM then, given input history X, it will eventually transmit output
history f_MEM(X).

This specification says nothing explicit about the states of MEM. This is a basic
property of the history function approach to system specification: even for a device whose
purpose is to have states, such as a memory, the specification does not mention the states. Of
course, the memory does have states. Since the input history is a complete record of
everything that has ever happened to the system, it contains enough information to determine
the state.
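
As an illustration of the history-function view, the following Python sketch computes the output history of a MEM-like memory directly from the input history, without keeping any state between packets. The tuple encoding of packets and the name `f_mem` are assumptions made for this example only.

```python
def f_mem(x):
    """Output history assigned to input history x by a MEM-style specification.

    Packets are tuples: ("WRITE", addr, data), ("READ", addr), ("RTR", addr, data).
    """
    out = []
    for i, packet in enumerate(x):
        if packet[0] != "READ":
            continue                      # WRITE packets produce no output
        addr = packet[1]
        data = 0                          # initial contents of every word is zero
        for earlier in x[:i]:             # scan the history before this READ
            if earlier[0] == "WRITE" and earlier[1] == addr:
                data = earlier[2]         # keep the last matching WRITE
        out.append(("RTR", addr, data))
    return out


if __name__ == "__main__":
    history = [("WRITE", 1, 11), ("WRITE", 2, 22), ("READ", 1), ("READ", 2), ("READ", 3)]
    print(f_mem(history))
```

The function rescans the history for every READ on purpose: it mirrors the point made above that the input history alone determines what a stateful device must answer.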



We now show how the system MEM may be built. The system uses a real
random access memory, with capacity of one word for each possible value of the "addr"
field of incoming packets. We choose some obvious correspondence between the values of
the "addr" field and word addresses. Each word can contain any of the possible values of the
"data" field of incoming WRITE packets. We choose some obvious correspondence here also.
The memory is initialized with all words containing zero.

The algorithm of the implementation of MEM is as follows: If a packet
WRITE(addr, data) is received, the data field is written into memory at the word address given
by the addr field. If a packet READ(addr) is received, the word at the appropriate address is
nondestructively read, and a packet RTR(addr, data), containing the data fetched from memory,
is returned.

This system may be implemented by the program which follows. "Memory" is
an array which represents the actual memory.



procedure MEM

    input port IN
    output port OUT

    var command, addr, data
    array memory init 0

    | wait for input

    A: until packet is available at IN do
       command := packet from port IN

    | analyze input packet

    if command = READ(--) then
        let command = READ(addr);
        send RTR(addr, memory(addr)) at port OUT
    else
        let command = WRITE(addr, data);
        memory(addr) := data

    goto A
Notes:

(1) The statements for receiving and transmitting packets are extremely primitive. Slightly

{■iliyiJ ynat^a* tejiife he* niMdta^^B 

(2) The expression RTR(addr, data) means "a RTR packet whose fields are filled with the
current values contained in addr and data".

(3) The "--" in conditions has its usual meaning. "if packet = WRITE(3,--)" means "if packet
is a WRITE packet whose first field is 3".

(4) The "let <variable> = <pattern>" statement is an assignment statement that sets the variables
appearing in the pattern to have the values of the corresponding fields of the packet. "let
thing = WRITE(addr,--)" means "if the type of thing is not WRITE, it is an error; otherwise
set addr to the first field of thing and ignore the second field".
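
For readers who prefer executable code, the program above can be approximated in Python as follows. This is a sketch under stated assumptions: the class name `Mem`, the `receive` method, the helper `run`, and the tuple encoding of packets are all invented here for illustration, and port synchronization is replaced by ordinary function calls.

```python
class Mem:
    """Sequential sketch of the MEM implementation: one memory array, one packet at a time."""

    def __init__(self):
        self.memory = {}                  # word address -> data; missing words hold zero

    def receive(self, packet):
        """Process one packet from port IN; return the list of packets sent at port OUT."""
        if packet[0] == "READ":
            _, addr = packet              # let command = READ(addr)
            return [("RTR", addr, self.memory.get(addr, 0))]
        if packet[0] == "WRITE":
            _, addr, data = packet        # let command = WRITE(addr, data)
            self.memory[addr] = data
            return []
        raise ValueError("unknown packet type: %r" % (packet[0],))


def run(history):
    """Feed a whole input history to a fresh Mem and collect its output history."""
    mem, out = Mem(), []
    for packet in history:
        out.extend(mem.receive(packet))
    return out


if __name__ == "__main__":
    print(run([("WRITE", 1, 11), ("READ", 1), ("WRITE", 1, 12), ("READ", 1), ("READ", 9)]))
```

Unlike a pure history function, this version keeps state in `memory`; the correctness argument in the text is what relates the two views.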

We now prove that this implementation satisfies the specification f_MEM. First, we need to
show that the memory state equals the system state (as defined by the input history) under
the following correspondence:

For all X, the contents of memory address X for a given input history is

zero if the input history contains no packets WRITE(X,--)

Y if the history does contain such packets, and the last is WRITE(X,Y)

Proof by induction on the length of the history at port IN. For length zero, all cells contain
zero by initialization, and the history contains no WRITE packets at all. Otherwise assume
true for any history of length K and prove it for K+1.



If IN_{K+1} = READ(--), nothing was written into memory between receipt of IN_K
and IN_{K+1}, so the memory state did not change. The existence of WRITE(--,--) packets did
not change either.

If IN_{K+1} = WRITE(addr, data), no memory cell other than addr changed, and the
existence of WRITE(X,--) packets did not change for X ≠ addr. The contents of memory cell
addr is now data, and the last WRITE(addr,--) in the history is now obviously WRITE(addr,
data).

Next, we prove correctness of the implementation. If the input history = X, we
will show that f_MEM(X) will appear at the output. This proof is also by induction. If |X| = 0,
f_MEM(X) is empty. But the implementation specifies no output except in response to input. Now
suppose X' = x_1 x_2 ... x_K x_{K+1}. Let X = x_1 x_2 ... x_K. By induction, f_MEM(X) appeared at the
output when X was the input history. When x_{K+1} arrived, the system transmitted no output if
x_{K+1} was a WRITE, and transmitted RTR(addr, memory(addr)) if x_{K+1} was READ(addr).
Therefore the response to X' is

f_MEM(X) concatenated with

nothing if x_{K+1} = WRITE(--,--)

RTR(addr, memory(addr)) if x_{K+1} = READ(addr), where the memory
state is that left by X



Now |f_MEM(X')| = |f_MEM(X)| + 1 if x_{K+1} is READ(--), which is the length of the
response to X'.

Also, if x_{K+1} = WRITE(--,--), f_MEM(X') = f_MEM(X), and if x_{K+1} = READ(addr), f_MEM(X')
= f_MEM(X) concatenated with RTR(addr, z), where z = the data field of the last WRITE(addr,--)
in X (or zero if there is no such WRITE).

2.1 NONDETERMINACY

Nondeterminate systems can take a wide variety of forms, and the problem of
formalizing the behavior of all nondeterminate systems is far too complex to be treated in this
thesis. Only the types of nondeterminacy that arise in connection with the structure facility
for the data flow machine will be treated.

The principal type of nondeterminacy that will arise in packet memory systems
is the removal of the requirement that the RTR packets be returned in the same order as the
READ packets that gave rise to them. For example, the input history

( WRITE(1,11) | WRITE(2,22) | READ(1) | READ(2) ) could result in

( RTR(1,11) | RTR(2,22) ) or in ( RTR(2,22) | RTR(1,11) )

The system MEM is too simple to display this sort of nondeterminacy. For example, MEM
would return RTR(1,11) as soon as it received the first READ packet. It would not yet know
that it was about to receive a second READ packet which would give it the option of
producing its output packets in either of two orders. Later, we will exhibit implementations of
systems which can meaningfully take advantage of this nondeterminacy. For now, we will just
have to accept that such implementations (that is, descriptive specifications) exist, and
examine the form that the functional specification for such a system might take.

2.1.1 FUNCTIONAL SPECIFICATIONS OF NONDETERMINATE SYSTEMS 

A nondeterminate system can give any of several legal output histories in
response to a given input history. The "function" defining the system's behavior is therefore
multiple valued. One way to handle this situation is to treat the behavior of a system as being
defined by a relation instead of a function. The method to be used here, which is completely
equivalent, is to use functions whose values are sets of output histories. For example, in the
system f_NDMEM that we are developing,

f_NDMEM( WRITE(1,11) | WRITE(2,22) | READ(1) | READ(2) ) =
{ ( RTR(1,11) | RTR(2,22) ), ( RTR(2,22) | RTR(1,11) ) }

The set f(X) may in fact be empty for some X. This means that X is not a
valid input history, and the behavior of the system is undefined. This is different from the
situation in which an illegal input gives rise to a well-defined "error" response (packet) from
the system. An "error" packet is certainly more desirable than saying the system behavior is
undefined, but some illegal inputs, such as receiving acknowledges for packets that were not
sent, are so pathological that they must simply be assumed not to occur. Furthermore, at some
levels of detail in the description of a system, it is convenient to ignore error conditions if one
can prove that they won't occur when the system is functioning properly.



A functional description of a nondeterminate system is therefore a definition of
a function which maps input histories into sets of output histories. It is usually most
convenient to describe it as a predicate defining which histories are in f(X) for a given X, and
that predicate is often the logical AND of a number of other predicates, so the functional
specification has the form:

    Y is in f(X) if
        P_1(X,Y) and
        P_2(X,Y) etc.

The rule for realization of a function is as follows. A system realizes f if, given input history
X with f(X) nonempty, it will eventually produce as output some history in f(X).



The multiple valued functions realized by nondeterminate systems must obey

NONDETERMINATE MONOTONICITY (ND-MONOTONICITY)

    If Q and P are input histories and Q ≥ P, then for
    any output history X in f(P), if f(Q) is nonempty there
    is a history Y in f(Q) with Y ≥ X

Roughly speaking, this means that receipt of a legal input symbol will never
make the system unable to proceed legally. The purpose of the specification "if f(Q) is
nonempty" is to allow for the possibility that an illegal input packet might make the system
unable to proceed.



We can now give the functional specification of a system
NDMEM which can arbitrarily mix RTR packets for different addresses.

f_NDMEM

If X = input history and Y = output history,
Y is in f_NDMEM(X) if

(1) Y consists only of packets RTR(--,--), and

(2) For all addr, the number of READ(addr)'s in X = the
    number of RTR(addr,--)'s in Y, and

(3) For all addr and K, the K-th RTR(addr,--) in Y, if it exists, is RTR(addr,val)
    where the last WRITE(addr,--) in X before the K-th READ(addr) in X
    is WRITE(addr,val) if such a WRITE(addr,--) exists, or val = 0
    if no WRITE(addr,--) exists before the K-th READ(addr) in X
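
The three conditions can be read as a membership test. The Python sketch below checks whether a candidate output history Y is a legal response to an input history X under conditions (1) to (3); the function names and the tuple packet encoding are assumptions made for this illustration.

```python
def expected_values(x, addr):
    """Data that the 1st, 2nd, ... READ(addr) in x must return, in order."""
    values, current = [], 0               # every word initially holds zero
    for p in x:
        if p[0] == "WRITE" and p[1] == addr:
            current = p[2]
        elif p[0] == "READ" and p[1] == addr:
            values.append(current)
    return values


def in_f_ndmem(x, y):
    """True if output history y is a legal response to input history x."""
    if any(p[0] != "RTR" for p in y):     # condition (1)
        return False
    addresses = {p[1] for p in x} | {p[1] for p in y}
    for addr in addresses:
        returned = [p[2] for p in y if p[1] == addr]
        # Conditions (2) and (3): same number of packets per address, and the
        # K-th RTR(addr,--) carries the value seen by the K-th READ(addr).
        if returned != expected_values(x, addr):
            return False
    return True


if __name__ == "__main__":
    X = [("WRITE", 1, 11), ("WRITE", 2, 22), ("READ", 1), ("READ", 2)]
    print(in_f_ndmem(X, [("RTR", 1, 11), ("RTR", 2, 22)]))
    print(in_f_ndmem(X, [("RTR", 2, 22), ("RTR", 1, 11)]))
```

Note that the test constrains order only within one address, which is exactly the freedom NDMEM is meant to have.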



The system NDMEM has the property that the data returned in a RTR packet is
the data in the memory (that is, the data in the most recent WRITE command addressing that
cell) at the instant of the READ command corresponding to the RTR. At the instant the RTR
packet is sent out, another WRITE command might have already been received, but that WRITE
will have no effect on this RTR packet.



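[Packet timing graph. Input: WRITE(A,1), READ(A), WRITE(A,2), READ(A). Output: RTR(A,1),
RTR(A,2), with the first RTR packet sent after WRITE(A,2) has been received. Horizontal axis: time.]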



At the instant the first RTR packet was returned, a WRITE command changing
the data from 1 to 2 had already been received, but the specification requires that the value
1 be returned.

SYSTEM e1 (realization of f_NDMEM)

(3) Take messages out of the buffer and send them as output packets at any
    time and in any order, subject to the restrictions that:
    (a) every message is eventually removed, and
    (b) whenever a message is removed, it must be the oldest one in the
        buffer for its address (the buffer is FIFO on each address).



The implementation given above is unsatisfying in that operations on the memory must be
instantaneous, so it does not take advantage of the delay between
a READ packet and the RTR packet that results. The data in the RTR packet must be the
contents of the memory at the instant the READ packet arrived. We would like the
system to be able to read the memory word at any time during the READ/RTR
interval. Here is an example of a system that takes such liberty:

SYSTEM e2 (purported realization of f_NDMEM)

(1) When a WRITE command comes in, write the word of memory instantly.

(2) When a READ command comes in, put the message READ(addr) in the
    Pending Read Buffer (PRB).

(3) Take messages off the PRB at any time and subject to
    the same restrictions as before, namely that every
    message is eventually removed and the buffer is FIFO on
    each address. When the message READ(addr) is taken from the
    Pending Read Buffer, fetch the data from memory and form
    a message RTR(addr,data). Send the latter to the
    Finished Read Buffer (FRB).

(4) Take messages off the FRB at any time and in any order
    subject to the same restrictions as before, form a RTR
    packet, and send it as output of the system.



This implementation does not realize f_NDMEM. In the packet timing graph after
the definition of f_NDMEM, the first RTR packet might have value 1 or 2 if this implementation is
used. (The second RTR packet will always have data value 2.)
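
The difference between the two possible answers can be reproduced with a small simulation. The sketch below is a simplification made for illustration only: the Finished Read Buffer is collapsed (a RTR is emitted as soon as a pending read is served), a single FIFO stands in for the per-address FIFO rule, and the event name "serve" and the function `run_e2` are hypothetical.

```python
from collections import deque


def run_e2(events):
    """Simulate a simplified System e2.

    `events` is a schedule: input packets interleaved with the string "serve",
    which means "take the oldest message off the Pending Read Buffer now".
    """
    memory, prb, out = {}, deque(), []
    for ev in events:
        if ev == "serve":
            addr = prb.popleft()                         # oldest pending READ
            out.append(("RTR", addr, memory.get(addr, 0)))  # memory is read at service time
        elif ev[0] == "WRITE":
            memory[ev[1]] = ev[2]                        # writes take effect instantly
        else:
            prb.append(ev[1])                            # READ(addr) waits in the PRB
    return out


if __name__ == "__main__":
    W1, W2, R = ("WRITE", "A", 1), ("WRITE", "A", 2), ("READ", "A")
    print(run_e2([W1, R, "serve", W2, R, "serve"]))   # read served before the second write
    print(run_e2([W1, R, W2, "serve", R, "serve"]))   # read served after the second write
```

In the second schedule the first READ is served only after WRITE(A,2) has been applied, so it returns 2 rather than the value 1 that the specification demands.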

We might like the system to take even more liberty, by performing memory
writes, as well as reads, whenever it wishes. Such an implementation might be as follows:

System e3 (purported realization of f_NDMEM)

(1) When a WRITE packet comes in, put the message WRITE(addr, data)
    on the Pending Write Buffer (PWB).

(2) Same as (2) in System e2.

(3) Take messages off the PWB subject to the same restrictions
    as before, and write the data into memory.

(4) Same as (3) in System e2, except that there is an additional
    restriction that no READ(addr) message may be taken from the PRB
    while there is a WRITE(addr,--) message in the PWB.

(5) Same as (4) in System e2.



This system does not realize f_NDMEM either. However, both System e2 and System e3 do
realize f_NDMEM if no WRITE packet is ever sent to the system when any READ/RTR transactions
are in progress on that word, that is, before a WRITE packet is sent, a RTR packet must have
been received for every READ packet sent addressing the same word. Fortunately, it is not
difficult to guarantee that this requirement is met. It simply involves a nondeterminate functional
specification for the "rest of the world", which we will call the "user".



Definition: The user of a system is that to which the
system connects, and is itself a system. The input ports of
the user are the output ports of the given system, and vice versa.



It would of course be totally useless to require that, in order for a realization
of f_NDMEM to work, its user realize some determinate functional specification. In fact, the
user of a system should be as unrestricted in its behavior as possible. Such
restrictions can generally be expressed by requiring that the user realize some nondeterminate
function, just as the system itself does. There is no difference between system specifications
and user specifications but a matter of degree of restrictiveness.



The requirement that NDMEM's user not send a WRITE command when any
READ/RTR transactions are in progress can be met by requiring it to realize the following
function f_USER:



If Y = input history of USER and X = output history,
(note the exchange of input and output so that X and Y
refer to the same packet streams in both the system and its user)
then X is in f_USER(Y) if

(1) X consists only of packets READ(--) and WRITE(--,--)
(2) For all addr, for any WRITE(addr,--) in X, the number of
    READ(addr)'s preceding it in X is ≤ the number
    of RTR(addr,--)'s in Y
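
The user restriction can likewise be written as a predicate. The following Python sketch tests whether a user output history X is legal given the RTR packets Y that the user has received; the name `in_f_user` and the tuple packet encoding are assumptions made here.

```python
def in_f_user(y, x):
    """True if user output x is legal given user input y (a history of RTR packets)."""
    if any(p[0] not in ("READ", "WRITE") for p in x):    # condition (1)
        return False
    reads_so_far = {}                                    # addr -> READs seen so far in x
    for p in x:
        addr = p[1]
        if p[0] == "READ":
            reads_so_far[addr] = reads_so_far.get(addr, 0) + 1
        else:
            # Condition (2): every earlier READ(addr) must already be answered in y.
            answered = sum(1 for q in y if q[0] == "RTR" and q[1] == addr)
            if reads_so_far.get(addr, 0) > answered:
                return False
    return True


if __name__ == "__main__":
    x = [("READ", 1), ("WRITE", 1, 5)]
    print(in_f_user([], x))                 # the WRITE is sent while a READ is outstanding
    print(in_f_user([("RTR", 1, 0)], x))    # the READ has been answered first
```

Lengthening Y can only turn a rejected X into an accepted one, never the reverse, which is the property the text relies on next.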



The function f_USER is easily seen to be ND-monotonic. This is because the
restrictions on the user's output X never become more stringent as Y increases. As Y
increases, the proposition "the number of READ(addr)'s preceding it in X is ≤ the number of
RTR(addr,--)'s in Y" never goes from true to false, so the set of legal arrays X does not
decrease. (If the ≤ had been replaced by =, it would not be ND-monotonic.)



While system e3 does not by itself realize f_NDMEM, it does realize f_NDMEM if
connected to a user that realizes f_USER. To prove this, the important step is to show that each
READ(addr) packet generates a RTR packet containing data defined by the most recent
WRITE(addr,--) packet preceding the given READ(addr) packet in the input stream.

Let t0 = the instant when the READ(addr) packet comes in. There may be
pending WRITE(addr,--) packets in the PWB at t0. If there are none, the most recent
WRITE(addr,--) packet in the input stream has already passed out of the PWB and into the
memory unit, so its data is in memory word addr. If there are WRITE(addr,--) packets in the
PWB at t0, the most recently inserted packet there is the most recent WRITE(addr,--) packet
in the input stream. Therefore, letting



D_addr(t) =

    the data in the youngest WRITE(addr,--) packet in the PWB at time t
        if there is such a packet
    the contents of word addr in the memory unit if not,

we must show that the data to be eventually returned in a RTR packet is D_addr(t0). Let t1 =
the instant when the READ(addr) packet leaves the PRB. First, we show that D_addr(t) does not
change from t0 to t1. Since the READ(addr) packet has entered the system, it has left the user.
Since the corresponding RTR(addr,--) packet has not yet been generated by the system (and
won't be until after t1), it has not been received by the user. Therefore, there is a READ/RTR
transaction pending on addr, and the user is not sending any WRITE(addr,--) packets.
Therefore, whichever WRITE(addr,--) packet in the PWB is youngest will stay youngest as
long as it stays in the PWB. As long as there are any WRITE(addr,--) packets in the PWB,
D_addr does not change. As long as there are no WRITE(addr,--) packets in the PWB, D_addr = the
contents of memory, which doesn't change either, because only removal of a WRITE(addr,--)
packet from the PWB can change the contents of memory word addr.



There can be no transitions from no WRITE(addr,--) packets in the PWB to one
or more packets, because the user is not sending any. The remaining case to consider is the
disappearance of the last WRITE(addr,--) packet from the PWB. This packet is clearly the
youngest, so D_addr(just before disappearance) = the data in the packet. This data is written
into memory by rule 3 of the implementation. D_addr(just after disappearance) = data written
into memory = data in the packet that disappeared. Therefore D_addr(t0) = D_addr(t1).

At time t1, when the READ(addr) packet leaves the PRB, there are no
WRITE(addr,--) packets in the PWB, by rule 4 of the implementation. Therefore D_addr(t0) =
D_addr(t1) = contents of memory word addr at t1. But when the READ(addr) packet is taken from
the PRB, the memory word is read and its data goes into a RTR(addr,--) packet in the FRB.
That packet is therefore the packet that will eventually be returned
to the user.
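
The argument above can be exercised with a randomized simulation. The Python sketch below is an illustration under several assumptions: buffers are plain lists, the scheduler picks any permitted move at random, the user model simply withholds a WRITE(addr,--) while a READ/RTR transaction on addr is pending, and all names (`simulate_e3`, `first_per_addr`) are invented here. It checks that every returned RTR carries the value of the most recent WRITE preceding its READ in the input stream.

```python
import random


def first_per_addr(buf):
    """Indices of the oldest message for each address (FIFO on each address)."""
    seen, indices = set(), []
    for i, m in enumerate(buf):
        if m["addr"] not in seen:
            seen.add(m["addr"])
            indices.append(i)
    return indices


def simulate_e3(script, seed):
    """Run a System e3 sketch against a rule-abiding user; True if every RTR is correct."""
    rng = random.Random(seed)
    memory, last_write, pending = {}, {}, {}
    pwb, prb, frb = [], [], []
    script = list(script)
    while script or pwb or prb or frb:
        moves = []
        if script:
            kind, addr = script[0][0], script[0][1]
            # The user sends a WRITE only when no READ/RTR transaction on addr is pending.
            if kind == "READ" or pending.get(addr, 0) == 0:
                moves.append(("send", 0))
        moves += [("pwb", i) for i in first_per_addr(pwb)]
        writes_waiting = {m["addr"] for m in pwb}
        # Rule 4: a READ(addr) may leave the PRB only if no WRITE(addr,--) is in the PWB.
        moves += [("prb", i) for i in first_per_addr(prb)
                  if prb[i]["addr"] not in writes_waiting]
        moves += [("frb", i) for i in first_per_addr(frb)]
        move, i = rng.choice(moves)
        if move == "send":
            p = script.pop(0)
            if p[0] == "WRITE":
                last_write[p[1]] = p[2]
                pwb.append({"addr": p[1], "data": p[2]})
            else:
                pending[p[1]] = pending.get(p[1], 0) + 1
                # Record the value the specification requires, for checking only.
                prb.append({"addr": p[1], "want": last_write.get(p[1], 0)})
        elif move == "pwb":
            m = pwb.pop(i)
            memory[m["addr"]] = m["data"]
        elif move == "prb":
            m = prb.pop(i)
            m["data"] = memory.get(m["addr"], 0)
            frb.append(m)
        else:
            m = frb.pop(i)                 # the RTR packet reaches the user
            pending[m["addr"]] -= 1
            if m["data"] != m["want"]:
                return False
    return True


if __name__ == "__main__":
    script = [("WRITE", "A", 1), ("READ", "A"), ("WRITE", "A", 2), ("READ", "A"),
              ("WRITE", "B", 5), ("READ", "B"), ("READ", "A")]
    print(all(simulate_e3(script, seed) for seed in range(50)))
```

A simulation of this kind is no substitute for the proof; it only samples some interleavings, but it is a convenient way to catch mistakes in a restatement of the rules.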



This example demonstrates a general principle:

    Whether or not a given implementation of a system realizes a
    given function may depend on whether that system's user
    realizes some other specific function.



There is no way to get around this fact. There are systems that correctly
realize useful functions (even completely determinate functions) when connected to systems
that obey certain rules, but behave in a totally pathological way otherwise. Furthermore, the
system often can't tell whether the user has broken the rules. In the case of system e3
above, the system would have been able to tell whether a WRITE(addr,--) packet came in
while a READ/RTR transaction was pending on word addr, but in some cases the system has
no way of knowing whether its user is misbehaving.

The structure controller and packet memory system for a data flow computer is
such a system. Perhaps the most important example of the structure controller and memory's
dependence on the behavior of their user is the reference count and garbage collection
problem. The rules that the user (i.e. the data flow computer) must obey in order to assure
correct reference accounting are as follows:

(1) No pointer to a structure may be duplicated without giving a
    command to increase the reference count.
(2) No command to decrease the reference count may be given
    unless a pointer is discarded.

These rules guarantee that the reference count for a node is at least as great
as the number of pointers to the node contained anywhere in the computer. (Actually, the
rules will be such that the reference count is exactly equal to the number of pointers to the
node. However, the penalty for too high a reference count is simply that a useless structure
fails to be reclaimed and wastes memory space.)

Now suppose the computer (that is, the structure controller's and memory's
user) violates the rule and allows the reference count to become too small. Eventually the
reference count may become zero while a pointer to the node still exists somewhere. When
the count goes to zero, the memory system reclaims the node and puts it on the list of free
nodes.

Two possibilities then arise. If an immediate attempt is made to use the
"spurious" pointer to the node, in a SELECT instruction for example, the structure controller will
send a READ command to the memory. The memory will know that this is an illegal command,
that is, that the user has violated its specification. It can then signal an appropriate error
condition in order to prevent the computation from giving an incorrect result.



If, on the other hand, the node has been taken from the free storage list and used by
the structure controller to build a new structure by the time the spurious pointer is used,
there is no way the memory can detect the error. It has no choice but to
process the spurious command in the normal way, referring to a structure
which is completely unrelated to the intended one. The memory thus has no way to check for
such errors in the handling of reference counts; the matter is taken up again
in section 5.0.6.



2.1.2 MUTUAL CONSISTENCY OF FUNCTIONAL REALIZATIONS

Suppose a system realizes f_SYS contingent on its user realizing f_USER, which the
user does if the original system realizes f_SYS. Does it follow that the realizations actually
occur when the two systems are connected to each other? Is it possible that they could both
violate their specifications, with each blaming the other? Let the systems be S and T. Each is
the other's user.



If a violation occurs, there must be a first instant of violation. That is,
there is an instant t0 when it first becomes true that one system (say S) has an output history
which does not legally follow from its input history. There is a delay, however slight (even if
it is only the delay caused by propagation of electric currents through wires) in the behavior
of S. Therefore its output history at t0 depends on its input history slightly before t0, at a
time when T was not malfunctioning, so S cannot blame its malfunction on T. Even if S and T
both malfunction at precisely the same instant, neither S nor T knows about the malfunction of
the other at that instant, and so neither malfunction can be excused. It follows that, if both
systems conditionally obey their functional specifications, they will obey their specifications in
practice.

2.1.3 MONOTONICITY OF FUNCTIONAL SPECIFICATIONS OF THE USER

We now give an example of how not to define the functional specification of a
user. Suppose the system MEM has destructive readout, so that it requires that the user
rewrite any data that it reads. Suppose further that for some reason the same data must be
rewritten, and that it must be done immediately, that is, no other transactions may take place
at any address between the read and the rewrite. Here is an attempt at a functional
specification for USER. Since USER doesn't know what data to write until it receives the RTR
packet, we will require the rewrite to be a consequence of the RTR.



f_USER

Y = input to user, X = output from user

For all addr and i, if the i-th RTR(addr,--) exists in Y and is RTR(addr,data),
then the i-th READ(addr) in X is immediately followed in X by WRITE(addr,data)



Unfortunately, this does not require the user to wait for the RTR packet after
sending any READ, not sending any more packets until the RTR arrives. For example, the user
might send

( READ(1) | READ(2) )

Until the RTR(1,data) packet comes back, the user has not broken any rules.
When the RTR(1,data) does come back, the user will have retroactively broken the rules and
be unable to do anything about it. Since we would like to simplify as much as possible the
task of proving that systems obey functional specifications, we need to make the
specifications reflect the types of decisions that systems make in practice. It doesn't make
sense for a system to perform some operation or emit some result packet on the basis of an
input packet not having arrived and not being about to arrive, so f_USER, as given above, is
unreasonable.
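
The retroactive violation can be checked mechanically. The Python sketch below encodes the attempted rewrite rule as a predicate and shows that the output ( READ(1) | READ(2) ) is legal before RTR(1,data) arrives but can no longer be extended to a legal output afterwards. The function name `in_naive_f_user`, the tuple encoding, and the choice data = 7 are assumptions made for this example.

```python
def in_naive_f_user(y, x):
    """True if user output x satisfies the attempted rewrite rule for user input y."""
    seen = {}                                   # addr -> RTR(addr,--) packets seen so far in y
    for (_, addr, data) in y:
        seen[addr] = seen.get(addr, 0) + 1
        reads = [k for k, q in enumerate(x) if q == ("READ", addr)]
        if len(reads) < seen[addr]:
            return False                        # an RTR with no matching READ
        k = reads[seen[addr] - 1]               # position of the i-th READ(addr) in x
        # The rule: that READ must be immediately followed by WRITE(addr, data).
        if k + 1 >= len(x) or x[k + 1] != ("WRITE", addr, data):
            return False
    return True


if __name__ == "__main__":
    P = []                                      # nothing received yet
    Q = [("RTR", 1, 7)]                         # the answer to READ(1) has arrived
    X = [("READ", 1), ("READ", 2)]
    print(in_naive_f_user(P, X))                            # legal so far
    print(in_naive_f_user(Q, X + [("WRITE", 1, 7)]))        # too late to comply
    print(in_naive_f_user(Q, [("READ", 1), ("WRITE", 1, 7)]))  # some legal output exists
```

Any extension of X keeps READ(2) directly after READ(1), so no history greater than or equal to X can satisfy the rule once RTR(1,7) is in the input.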






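The problem is that f_USER is not ND-monotonic. To see this, refer to the
definition of ND-monotonicity and let

    P = ( ), Q = ( RTR(1,data) )      (input histories)

    X = ( READ(1) | READ(2) )         (output history)

Now Q ≥ P, X is in f_USER(P), and f_USER(Q) is nonempty (containing, for example,
( READ(1) | WRITE(1,data) )), but no history in f_USER(Q) is greater than or equal to X.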









N*«*tptf4» 



x«« 



For all addr and i, the i-th READ(addr) in X, if it exists, is

    immediately followed in X by WRITE(addr,data)
        if there is an i-th RTR(addr,--) in Y and it is RTR(addr,data)
    last in X if there is no i-th RTR(addr,--) in Y



•^•—^w^^ 



■-B'"^™^' i i miiji ' . | "^ ^^ i * | i | w i w'^iiBWi!WMw^^'^^wi^^ie^ 



YftfteB sift aft^udiltfi^^ftAfli I^B AiMAi^^riMAeM^ftiibt 



2.2 PACKET ACKNOWLEDGMENTS AND SAFETY

All of the systems considered so far have had to respond to incoming packets
however fast they were sent by their user, and there was no limit to the rate at which the
user could send them. In the first implementation of MEM, the memory unit has to accept the
commands directly, and hence has to operate at unlimited speed. System e3, implementing
NDMEM, seems a slight improvement in that it only has to put the commands into its buffers
infinitely quickly, until one realizes that unless the memory unit itself is infinitely fast the
buffers have to be infinitely large.

This is clearly unacceptable: no interconnection of speed-independent modules
can make such assumptions. The problem is one of safety. No packet may be sent until its
destination is ready to receive it. The safety problem arises at several levels in data flow
computers. Here we are concerned with it only at its most microscopic level. The solution to
the problem is to acknowledge each packet transmission. That is, for each port transmitting
data, there is another port transmitting acknowledge packets in the opposite direction. Every
data packet must be acknowledged before the next data packet can be sent on the same port.
We will require all ports of all systems to have such an acknowledge port.

(Even systems which would be safe without acknowledge ports will have them.
This is because of the manner in which packets are transmitted. A packet transmission is
indicated by a zero to one transition of a "request" signal. An acknowledge signal from the
receiver is needed to tell the transmitter to reset the request signal.)

The implementation of the system MEM may be modified to acknowledge input
commands only after the transaction on the actual memory unit is completed. This will make it
impossible for the user to send a command while the memory is busy. Of course, the output
port must also have acknowledges, since the system to which the RTR packets are sent might
be slow and need to be protected against overruns on its input. So the algorithm for AMEM
(MEM with acknowledges) might be:

(1) If a WRITE packet is received, update the memory (take your time!)
    and then send an acknowledge on the input's acknowledge port.
(2) If a READ packet is received, read the memory and send
    a RTR packet out.
(3) If an acknowledge is received on the output's acknowledge port,
    send an acknowledge on the input's acknowledge port.



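The acknowledge packets are normal packets and are treated as such in the
functional specification of a system. That is, acknowledge ports are treated exactly as though
they were ordinary ports. The acknowledge port of an input port X is denoted X_A, and the
acknowledge port of an output port Y is denoted Y_A.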




f_AMEM

X = input history, Y = output history, X_A and Y_A = the corresponding acknowledge histories

(1) |Y| = number of READs in X

(2) Y_i = RTR(addr,data), where the i-th READ in X is READ(addr) and
    the last WRITE(addr,--) before it, if there is
    one, is WRITE(addr,data), or data = 0 if there is no WRITE(addr,--)
    before the i-th READ

(3) |X_A| = |Y_A| + number of WRITEs in X

(4) X_A consists only of "ack"

(5) |X_A| ≤ |X|

(6) |Y| ≤ |Y_A| + 1


It is easy to prove that the given implementation realizes parts (1), (2), (3), and
(4) of f_AMEM. (It is very similar to MEM.) Parts (4), (5), and (6) constitute the "Standard
Acknowledge Restriction" that we will require all systems and users to obey.

Standard Acknowledge Restriction (SAR) - weak form

If X is an input port and X^A is its acknowledge port,
(1) X^A consists only of "ack"
(2) |X^A| <= |X|

If Y is an output port and Y^A is its acknowledge port,
(3) |Y| <= |Y^A| + 1

Given that a system and its user both obey the weak form of the SAR, we can
easily show that they obey the following:

Standard Acknowledge Restriction (SAR) - strong form 

If Z is an input or output port and Z^A is its acknowledge port,
(1) Z^A consists only of "ack"
(2) |Z^A| <= |Z| <= |Z^A| + 1

Proof: If Z is an input port of the system and an output port of the user, (1) and |Z^A| <= |Z|
follow from the SAR on the system (letting Z = X), and |Z| <= |Z^A| + 1 follows from the SAR
on the user (letting Z = Y). If Z is an output port of the system and an input port of the user,
just exchange "system" and "user".

The SAR is clearly ND-monotonic and hence admissible as part of a functional
specification.

In any proof that a system realizes a function, it suffices to show that it obeys
the weak form of the SAR contingent on its user obeying the strong form.

We can now prove that AMEM realizes parts (5) and (6) of f_AMEM, that is, the
SAR in strong form.



Let Y = output of AMEM and input to user, X = input to AMEM and output of user.

|X^A| = number of acks sent as a consequence of (1) of the memory implementation
      + number of acks sent as a consequence of (3) of the memory implementation
      = number of WRITEs in X + |Y^A|

Now |Y| = number of READs in X (by (2) of the AMEM implementation)
        = |X| - number of WRITEs in X (by well-behavedness of the user)

Also |X^A| = number of WRITEs in X + |Y^A| (derived above)
          <= number of WRITEs in X + |Y| (from the SAR for the user)
           = number of WRITEs in X + number of READs in X (by (2) of the AMEM implementation)
           = |X|

This proves the weak form of the SAR, from which the strong form follows.

2.2.1 CANONICAL PACKET COMMUNICATION

Since the Standard Acknowledge Restriction rather severely limits the way
acknowledge ports are handled in the functional specification of a system, it is not uncommon
for the handling of the acknowledge ports to be pretty much the same in the implementation
of the system. Whenever possible, systems implement the receive and transmit operations in
the following way:

Canonical Packet Reception (RCVPKT)

(1) Wait until a packet has arrived on the input port (it might have already arrived by the
time this step is executed); take its data

(2) Send an acknowledge for it 

Canonical Packet Transmission (XMTPKT) 

(1) Send the packet 

(2) Wait for an acknowledge 

These operations will appear in the system implementation language as
"functions" that take port names as arguments and appear in assignment statements. The data
conveyed by the := is the contents of the packet. Assignment statements containing these
operations are like input/output operations in ordinary computer programs in that they "hang
up" the program until the packet communication has taken place. "var := RCVPKT(port)" waits
until an incoming packet has arrived (and then acknowledges same). "XMTPKT(port) :=
expression" waits until the transmitted packet has been acknowledged. Programs may use
multiprocessing as long as no RCVPKT or XMTPKT operations can be simultaneously executed
by two processes on the same port.

It is easy to see that any implementation using the RCVPKT and XMTPKT
operations obeys the Standard Acknowledge Restriction.
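
The blocking behavior of the two canonical operations can be sketched in Python. This is an illustration only, not the thesis's implementation language: the `Port` class, the use of threads and queues, the `doubler` toy system, and the timeout values are all assumptions made for the example.

```python
import queue
import threading

class Port:
    """A data port paired with its acknowledge port (hypothetical model)."""
    def __init__(self):
        self.data = queue.Queue()   # packets travelling forward
        self.acks = queue.Queue()   # "ack" packets travelling backward
        self.sent = 0               # |Z|: packets transmitted so far
        self.acked = 0              # |Z^A|: acknowledges transmitted so far

def xmtpkt(port, packet, timeout=5):
    """Canonical transmission: send the packet, then wait for its acknowledge."""
    port.sent += 1
    port.data.put(packet)
    port.acks.get(timeout=timeout)

def rcvpkt(port, timeout=5):
    """Canonical reception: wait for a packet, take its data, then acknowledge it."""
    packet = port.data.get(timeout=timeout)
    port.acked += 1
    port.acks.put("ack")
    return packet

def doubler(x, y, count):
    """A toy system: receive `count` numbers on x and transmit their doubles on y."""
    for _ in range(count):
        xmtpkt(y, 2 * rcvpkt(x))

x, y = Port(), Port()
worker = threading.Thread(target=doubler, args=(x, y, 3), daemon=True)
worker.start()
results = []
for n in (1, 2, 3):
    xmtpkt(x, n)                 # returns only after the system has acknowledged
    results.append(rcvpkt(y))    # acknowledges the result packet
worker.join(timeout=5)
```

Because each sender waits for an acknowledge before it can send again, the counters on every port stay within the strong form of the SAR throughout the run.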

Systems need not use these canonical operations in order to be correct. For 
example, the implementation of AMEM given previously did not. That is why the proof that it 
obeyed the Standard Acknowledge Restriction was so complicated. 

Here is an implementation of CMEM, a system whose behavior is similar (but not
identical) to AMEM:

process starts at A
input port X
output port Y
array memory init
var command, addr, data

A:  command := RCVPKT(X);
    if command = READ(--) then
        let command = READ(addr);
        data := memory(addr);
        XMTPKT(Y) := RTR(addr,data)
    else
        let command = WRITE(addr,data);
        memory(addr) := data;
    goto A

2.3 LATENCY



CMEM and AMEM behave differently in a subtle way. Suppose the user
transmits a READ packet and then refuses to acknowledge the RTR packet that results. AMEM
refuses to acknowledge the original READ, and the entire system comes to a halt, since the
user can't send another command packet until the previous one was acknowledged. CMEM
acknowledges the READ packet anyway (it happens automatically as part of the RCVPKT
operation). It then refuses to acknowledge any further command packets until the RTR is
acknowledged, because it gets hung up in the statement "XMTPKT(Y) := RTR(addr,data)".
CMEM behaves as though it has an input buffer capable of storing one packet.



This difference shows up in the functional specification. Lines 2, 4, 5, and 6 of
the specification of f_AMEM [section 2.2] apply to CMEM also. Lines 1 and 3 are different:

AMEM:

(1) |Y| = number of READs in X

(3) |X^A| = |Y^A| + |X| - number of READs in X

CMEM:

(1) |Y| = number of READs in X                   if |X| = 0 or 1
                                                 or (|X| >= 2 and |Y^A| >= number of READs in (X - last packet))
          number of READs in (X - last packet)   otherwise

(3) |X^A| = |X|       if |X| = 0 or 1
                      or (|X| >= 2 and |Y^A| >= number of READs in (X - last packet))
            |X| - 1   otherwise
This illustrates the fact that correct analysis of the latency of a system can be
quite complicated and requires careful analysis of the algorithm.

The only difference between AMEM and CMEM arises if the user fails to
acknowledge all RTR packets. [...] can easily show that, for both AMEM and CMEM,

|Y| = number of READs in X

(To prove this for CMEM, note that if |X| >= 2, then |Y^A| < number of READs
in (X - last packet) [...])

The latency of a system is the number of commands that it can accept and
acknowledge whose [...] have not been [...] by the user; that is, the number of
pending [...] behavior.

One system for which it can be defined is the N-stage first-in-first-out buffer.
A FIFO of length N (and hence latency N) is a system with one input port and one output
port, which [...]

The function realized by a FIFO of length N is:

[...]

A FIFO of length N >= 1 can be implemented with a queue of size N and the
following program:

processes start at A, B
input port X
output port Y
var k, m
var p init 0                | queue population

A:  until p < N do;
    k := RCVPKT(X);
    store k at end of queue;
    p := p + 1;
    goto A;

B:  until p > 0 do;
    m := item taken from front of queue;
    XMTPKT(Y) := m;
    p := p - 1;
    goto B
For N = 1 this becomes:

process starts at A
input port X
output port Y
var p

A:  p := RCVPKT(X);
    XMTPKT(Y) := p;
    goto A

A FIFO of latency zero cannot be implemented by any system using the RCVPKT
and XMTPKT operations, though it can be implemented with a few pieces of wire.

Appendix I contains a proof that a series connection of FIFO's of lengths M and
N yields a FIFO of length M+N.
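
The series-connection property can be made concrete with a small step-by-step model in Python. This is a sketch and not the proof of Appendix I: the `Fifo` class, its `offer`/`take` methods, and the `series` helper are hypothetical names, and "not accepted" stands in for an input packet that has not yet been acknowledged.

```python
from collections import deque

class Fifo:
    """Step-by-step model of a FIFO of length n (hypothetical helper)."""
    def __init__(self, n):
        if n < 1:
            raise ValueError("length must be at least 1")
        self.n = n
        self.items = deque()

    def offer(self, packet):
        """Process A: accept and acknowledge a packet only while population < n."""
        if len(self.items) >= self.n:
            return False          # not acknowledged yet; the sender stays blocked
        self.items.append(packet)
        return True

    def take(self):
        """Process B: hand the front item to the output, or None if empty."""
        return self.items.popleft() if self.items else None

def series(first, second):
    """Move as many packets as possible from one FIFO into the next."""
    while first.items and len(second.items) < second.n:
        second.offer(first.take())

a, b = Fifo(1), Fifo(2)
accepted = []
for packet in ["p1", "p2", "p3", "p4"]:
    accepted.append(a.offer(packet))
    series(a, b)
```

With lengths 1 and 2 in series and nothing draining the output, exactly three packets are accepted before the input blocks, as a single FIFO of length 3 would behave.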



When systems differ only in their latency, it is sometimes possible to make them
equivalent by adding FIFO's to various ports. For example, it can be shown that CMEM is
identical to AMEM with a FIFO of length one on its input port. If it could be shown that every
system X is equivalent, except for latency, to a system X_0 defined as having latency zero, then
the latency of the system X could be characterized as the lengths of the FIFO's that would
have to be added to the various ports of X_0 to make it identical to X. A system of latency zero
would have to be one which never acknowledges any input packet until all resulting output
packets have been sent and acknowledged. AMEM is such a system, so CMEM could be said to
have latency 1 on its input port and zero on its output port. It is not clear whether such
[...]

2.3.1 ARBITRATORS, DISTRIBUTORS, AND ALLOCATORS

Three basic systems are very important to the design of the structure
controller and memory, as well as other places in the data flow computer.

The arbitrator is a nondeterminate system with N inputs and one output, which
transmits each incoming packet to its output. The order of packets from each input must
be preserved in the output stream. Packets from different ports are merged arbitrarily,
depending on which input packet arrived first. In the arbitrator specification, the port
number is indicated by a superscript instead of a subscript.

basic (zero latency) arbitrator f_ARB

If X^1, X^2, ... X^N are inputs and Y is output,
(X^1A, X^2A, ... X^NA, Y) ∈ f_ARB(X^1, X^2, ... X^N, Y^A) if

(1) |Y| = sum over i = 1 to N of |X^i|

(2) For all i, |X^iA| = number of packets from port i among the first |Y^A|
    packets of Y

(3) For all i (1 <= i <= N), the sequence <i, X^i_1>, <i, X^i_2>, ...
    is a subsequence of Y.

Each incoming packet is tagged with its port number so that its source can be
identified in the output. This identification feature is used in a few, but not all, applications of
the arbitrator.

Arbitrators are the major component of the arbitration network of the data
flow computer. The principal use of the arbitrator in the structure memory is to allow the
address space to be divided into small pieces, with a separate memory module handling
transactions on each piece. The LOAD packets coming back from the several modules are
merged in an arbitrator, so that the entire interconnection of modules behaves as if it were
one memory system.

Arbitrators of nonzero latency may be defined as zero latency arbitrators with
various FIFO buffers on the ports. Such arbitrators are useful in different places throughout
the data flow computer, but there is one place where the arbitrator must have latency zero.
This is in the transmission of packets from the structure controller to the memory. When the
structure controller receives an acknowledge for a packet it has sent to the memory, it must
know that that packet is ahead of any other packets that might subsequently be sent to other
input ports of the arbitrator on that memory unit. This problem will be explained in section
5.

An arbitrator of zero latency may be realized by the following program:

process starts at A
input ports X^1 ... X^N
output port Y
var p, input

A:  wait until a packet is available on any input port;
    let p := that port;                | this is nondeterminate
    input := the packet on port p;     | do not acknowledge yet
    XMTPKT(Y) := <p, input>;
    send acknowledge on port p;
    goto A
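
The ordering requirement on an arbitrator's output can be checked mechanically. The Python sketch below tests only two conditions: that the output has as many packets as all inputs together, and that each port's tagged packets appear in the output in their arrival order. The function names and the dictionary representation of the input streams are assumptions for illustration; acknowledge ports are not modelled.

```python
def is_subsequence(small, big):
    """True if the items of `small` occur in `big` in the same relative order."""
    remaining = iter(big)
    return all(item in remaining for item in small)

def legal_merge(inputs, output):
    """Check a tagged output stream against the per-port input streams.

    `inputs` maps a port number to the list of packets that arrived on it;
    `output` is a list of (port, packet) pairs.
    """
    if len(output) != sum(len(packets) for packets in inputs.values()):
        return False
    return all(
        is_subsequence([(port, packet) for packet in packets], output)
        for port, packets in inputs.items()
    )

inputs = {1: ["a", "b"], 2: ["c"]}
good = legal_merge(inputs, [(1, "a"), (2, "c"), (1, "b")])
bad = legal_merge(inputs, [(1, "b"), (2, "c"), (1, "a")])
```

The first output interleaves the two ports but keeps port 1's packets in order, so it is accepted; the second swaps them and is rejected.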

A distributor is a determinate system with one input and N outputs, which
transmits incoming packets to the output port selected by a data field in the packet. Incoming
packets are assumed to be of the form <port, data>. The distributor strips off the "port" field
[...] The basic distributor realizes the following function:

basic (zero latency) distributor f_DIST

If X is input and Y^1, Y^2, ... Y^N are outputs,
(Y^1, Y^2, ... Y^N, X^A) ∈ f_DIST(X, Y^1A, Y^2A, ... Y^NA) if

(1) For all i (1 <= i <= N), |Y^i| = number of packets <i,--> in X

(2) For all i and j, Y^i_j = data, where the j-th packet <i,--> in X is <i,data>

[...]

process starts at A
input port X
output ports Y^1 ... Y^N
var z, port, data

A:  wait until a packet is available on port X;
    z := the packet on port X;         | do not acknowledge yet
    let z = <port, data>;
    XMTPKT(Y^port) := data;
    send acknowledge on port X;
    goto A

Higher latency distributors may be defined in terms of basic distributors and
FIFO buffers.

Distributors are the principal component of the distribution network of the data
flow computer.

An allocator is a nondeterminate variation of a distributor which transmits
incoming packets to one of several output ports. Each packet is sent to any output port that
is ready to receive it, that is, any port that has acknowledged all previous packets sent to it.
An allocator is normally used to send packets to a group of identical units, always selecting
any unit which is not busy. The structure controller of a data flow computer will typically be
realized in the form of several identical units in order to increase throughput. Operation
packets from the instruction cells will be sent through allocators to the structure control units.
(In fact, the other functional units of a data flow computer will be handled the same way.) An
N-output allocator realizes the following function:



basic (minimal latency) allocator f_ALLOC

If X is input and Y^1, Y^2, ... Y^N are outputs,
(Y^1, Y^2, ... Y^N, X^A) ∈ f_ALLOC(X, Y^1A, Y^2A, ... Y^NA) if

(1) sum over i = 1 to N of [...]

(2) |X^A| = min{ |X|, N - 1 + sum over k = 1 to N of |Y^kA| }

(3) Y^1, Y^2, ... Y^N are disjoint subsequences of X

It may be implemented by the following program:

processes start at A, B
input port X
output ports Y^1 ... Y^N
queue q size N init (1, 2, ... N)
var pop init N

A:  wait until a packet is available on port X;
    z := the packet on port X;         | do not acknowledge yet
    k := item at front of q;
    pop := pop - 1;
    send packet z on port Y^k;         | do not wait for acknowledge
    until pop > 0 do;
    [...]

B:  wait until an acknowledge is received on any port Y^i;
    let p := that port;
    take the acknowledge from port Y^p;
    put p at end of q;
    pop := pop + 1;
    [...]

The basic allocator given above does not have latency zero in the sense of not
acknowledging any input until the resultant output has been acknowledged - such an
[...] purpose. [...] does have the
minimum latency that [...]

3.0 THE BASIC MEMORY MODULE

In this section a formal specification of the memory module "MM" will be given.
MM is the fundamental building block of the packet memory system. Each MM system is a
memory, somewhat like the system NDMEM described earlier, which handles a specific set of
addresses. To increase total information transfer rate, the address space of the entire packet
memory system may be divided into smaller pieces, with one MM unit handling each piece.
The MM units are connected through arbitrators and distributors, and form a system which is
itself an MM. This is "horizontal" composition, and is quite similar to the interleaving found in
conventional memory systems. To increase the speed on individual transactions an MM unit
may have a cache module "CM" connected to it. MM with CM connected to it is itself an MM.
This is "vertical" composition, and is quite similar to the cache memories found in high
performance conventional computers.

MM has one input port CMDI ("command in") taking command packets from its
user, and one output port RESO ("result out") returning results to the user. The memory
space is divided into words or cells (the terms will be used interchangeably), each of which
corresponds to one node of a structure. Every memory transaction refers to one word, and
every incoming or outgoing packet bears the address of that word in its address field. The
memory space is the same size as the address space, and the size is known to the user, so
there can be no "nonexistent memory word" error. In most implementations, the memory size
would be 2^N where the address field of every packet is N bits.

Notation: FET(*) means any of FET, FET-, or FET+. LOAD(*) similarly.

Each word in the memory contains a data field and a reference count field,
which are used by the structure controller as described in section 1.2. LOAD(*) and UPD
packets have corresponding fields. Furthermore, FET(*) packets have a tag field, which is
returned unchanged in the corresponding LOAD(*) packet.

3.0.1 LATENCY AND INITIAL MEMORY CONTENTS

The specification of MM to be given below does not say anything about latency.
This is because MM's user is required to acknowledge every result packet. When this
happens, MM will acknowledge every command packet, regardless of what its actual latency is.

Initial memory contents will also be left unspecified. In the functional
specification of a memory, the definition of initial contents arises in the specification of the
system's response to a READ command that was not preceded by a WRITE. The specification
of MM will assume that this does not occur. In an actual data flow computer, a free storage
list will be generated when the system starts, which requires writing on every cell.



3.0.2 INFORMAL BEHAVIOR OF MM

There are 5 types of input packets to MM, and 4 types of output packets:

FET(addr, tag)          ("fetch") reads the addressed word and returns
                        LOAD(addr, data, ref, tag)
                        ["ref" is the reference count]

FET+(addr, tag)         increases the reference count by one and returns
                        LOAD+(addr, data, ref, tag)
                        ["ref" is the reference count after the increment]

FET-(addr, tag)         decreases the reference count by one and returns
                        LOAD-(addr, data, ref, tag)

CLR(addr)               ("clear") waits until all FET/LOAD, FET+/LOAD+, and
                        FET-/LOAD- transactions on the indicated word have
                        completed, and then returns DONE(addr)

UPD(addr, data, ref)    ("update") writes the data and reference count
                        into the addressed word. It returns no result,
                        and hence uses no tag.

MM is nondeterminate as was the example memory NDMEM, in that result
packets referring to different cells are not constrained to appear in the same order as the
commands that gave rise to them. MM is further nondeterminate in that it may rearrange
LOAD(*) packets referring to the same cell. Such nondeterminacy would not have made sense
for NDMEM, since RTR packets with the same data and same address were indistinguishable,
but, in the case of MM, LOAD(*) packets may have different tags.

Since LOAD(*) packets involve a change of reference count and may be
reordered arbitrarily, the question arises: What happens to the reference counts appearing in
such packets if they are reordered? The answer is that the result packets have reference
counts consistent with their own order, not the order of the original command packets.

Example: Suppose the reference count of cell A is 1, and the command sequence

FET+(A, T1) ; FET+(A, T2) ; FET-(A, T3) ; FET-(A, T4)

is sent. Some of the possible results are

LOAD+(A, D, 2, T1) ; LOAD+(A, D, 3, T2) ; LOAD-(A, D, 2, T3) ; LOAD-(A, D, 1, T4)

or
LOAD-(A, D, 0, T3) ; LOAD-(A, D, -1, T4) ; LOAD+(A, D, 0, T1) ; LOAD+(A, D, 1, T2)

The reference count temporarily becomes negative!

The reference count appearing in any LOAD+ packet is one more than the count
in the preceding LOAD(*) packet. Similarly, the count in a LOAD- is one less than, and the
count in a LOAD is equal to, the count in the preceding LOAD(*). Some implementations of MM
will never reorder LOAD(*) packets referring to the same address, although they may reorder
those for different addresses. If this is the case, the reference count will never become
negative, which removes the need for a sign bit in the reference count field.
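
The rule that result packets carry counts consistent with their own order can be replayed in a few lines of Python. This is a sketch for one cell only; the `replay` function and the tuple representation of packets are assumptions made for the example.

```python
def replay(initial_ref, order):
    """Return the reference count carried by each LOAD packet, in result order.

    `order` lists the commands in the order MM chose to answer them,
    as (kind, tag) pairs with kind one of "FET", "FET+", "FET-".
    """
    delta = {"FET": 0, "FET+": 1, "FET-": -1}
    ref = initial_ref
    results = []
    for kind, tag in order:
        ref += delta[kind]
        results.append((kind.replace("FET", "LOAD"), ref, tag))
    return results

in_order = replay(1, [("FET+", "T1"), ("FET+", "T2"), ("FET-", "T3"), ("FET-", "T4")])
reordered = replay(1, [("FET-", "T3"), ("FET-", "T4"), ("FET+", "T1"), ("FET+", "T2")])
```

The two calls reproduce the two result sequences of the example above, including the temporarily negative count in the reordered case.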



3.0.3 INFORMAL BEHAVIOR OF MM'S USER

When the user gives a CLR command, it must not send any further commands of
any type for the indicated cell until the corresponding DONE packet has been returned. (The
purpose of the CLR command is to clear out pending transactions; it would defeat its purpose
to continue sending commands.)

Like NDMEM, MM requires that no UPD command be given while any
transactions are pending on the indicated cell.

3.0.4 FORMAL DEFINITIONS OF MM AND MMUSER

These definitions do not show latency or make any reference to acknowledges.
The user is required to acknowledge every result packet and MM is consequently required to
acknowledge every command. Both systems of course obey the Standard Acknowledge
Restriction. The definitions do not consider the possibility of illegal packet types or invalid
fields in packets. All unbounded quantifiers are intended to range over a set that is in each
case obvious from context.

Note: in rules 2, 3, and 4 the zeroth DONE in Y means the beginning of Y. The
N+1st DONE in Y, where N = the number of DONEs in Y, means the end of Y. Similarly for CLRs
in X. The intention is to let the DONE and CLR packets break up X and Y into intervals, which
makes it convenient to think of the entire histories as being preceded and followed by DONE
or CLR packets.

f_MM

If X is input and Y is output, Y ∈ f_MM(X) if

(1) For all addr, the number of DONE(addr) packets in Y = the number of CLR(addr)
    packets in X

(2) For all addr, K, and tag, the number of LOAD(addr,--,--,tag) packets between the Kth
    and K+1st DONE(addr) in Y = the number of FET(addr,tag) packets between the Kth
    and K+1st CLR(addr) in X

(3) For all addr, K, and tag, the number of LOAD-(addr,--,--,tag) packets between the
    Kth and K+1st DONE(addr) in Y = the number of FET-(addr,tag) packets between the
    Kth and K+1st CLR(addr) in X

(4) For all addr, K, and tag, the number of LOAD+(addr,--,--,tag) packets between the
    Kth and K+1st DONE(addr) in Y = the number of FET+(addr,tag) packets between the
    Kth and K+1st CLR(addr) in X

(5) For all addr, J, and K, the Jth LOAD(*)(addr,--,--,--) in Y is
    LOAD(*)(addr,data,ref+D,--), where the last UPD(addr,--,--) before the Jth
    FET(*)(addr,--) in X is UPD(addr,data,ref) and is preceded by K FET(*)(addr,--)
    packets, and D = {number of LOAD+(addr,--,--,--) packets} - {number of LOAD-
    (addr,--,--,--) packets} among the K+1st to Jth LOAD(*)(addr,--,--,--) packets in Y.

f_MMUSER

If Y is input to user and X is output, X ∈ f_MMUSER(Y) if

(1) For all addr, either the number of CLR(addr) packets in X = the number of
    DONE(addr) packets in Y, or else there is one more CLR(addr) in X than DONE(addr)
    in Y, and there are no FET(*)(addr,--) or UPD(addr,--,--) packets after the last
    CLR(addr) in X

(2) For all addr, for any UPD(addr,--,--) in X, the number of FET(*)(addr,--) packets
    preceding it is <= the number of LOAD(*)(addr,--,--,--) packets in Y.



3.0.5

Implementation of MM with a random access device is quite easy. Assume the
memory is organized as two arrays, mem-data and mem-ref, holding the data and reference
count for each word, respectively. The following algorithm suffices:

process starts at A
array mem-data, mem-ref
var command, addr, data, ref, tag

A:  command := RCVPKT(CMDI);
    if command = FET(--,--) then                 | FET - return LOAD
        let command = FET(addr, tag);
        XMTPKT(RESO) := LOAD(addr, mem-data(addr), mem-ref(addr), tag)
    else if command = FET-(--,--) then           | FET- - decrement ref and return LOAD-
        let command = FET-(addr, tag);
        mem-ref(addr) := mem-ref(addr) - 1;
        XMTPKT(RESO) := LOAD-(addr, mem-data(addr), mem-ref(addr), tag)
    else if command = FET+(--,--) then           | FET+ - increment ref and return LOAD+
        let command = FET+(addr, tag);
        mem-ref(addr) := mem-ref(addr) + 1;
        XMTPKT(RESO) := LOAD+(addr, mem-data(addr), mem-ref(addr), tag)
    else if command = UPD(--,--,--) then         | UPD - update memory
        let command = UPD(addr, data, ref);
        mem-data(addr) := data;
        mem-ref(addr) := ref
    else                                         | CLR - return DONE
        let command = CLR(addr);
        XMTPKT(RESO) := DONE(addr);
    goto A
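
For readers who want to experiment, the same command dispatch can be written as a small sequential Python model. This is a sketch under assumptions: packets are plain tuples, the class and method names are hypothetical, acknowledges are not modelled, and the initial contents (None and 0) are arbitrary since the specification leaves them undefined.

```python
class MM:
    """Sequential model of the memory module; one command is handled at a time."""
    def __init__(self, size):
        self.mem_data = [None] * size   # initial contents are an arbitrary choice
        self.mem_ref = [0] * size

    def command(self, packet):
        """Handle one command packet; return the result packet, or None for UPD."""
        kind, addr, *rest = packet
        if kind in ("FET", "FET-", "FET+"):
            (tag,) = rest
            self.mem_ref[addr] += {"FET": 0, "FET-": -1, "FET+": 1}[kind]
            # LOAD, LOAD- or LOAD+ carries the count after the change
            return ("LOAD" + kind[3:], addr, self.mem_data[addr], self.mem_ref[addr], tag)
        if kind == "UPD":
            self.mem_data[addr], self.mem_ref[addr] = rest
            return None
        if kind == "CLR":
            return ("DONE", addr)
        raise ValueError(f"unknown packet type: {kind}")

mm = MM(size=8)
mm.command(("UPD", 5, "D", 1))
r1 = mm.command(("FET+", 5, "T1"))
r2 = mm.command(("FET-", 5, "T2"))
r3 = mm.command(("CLR", 5))
```

Because this model answers commands strictly in order, CLR can return DONE immediately, just as in the program above.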
3.1 HORIZONTAL INTERCONNECTIONS OF MM SYSTEMS

The functional specifications of MM and its user have the useful property that:

(1) f_MM and f_MMUSER are invariant under reordering of command packets
    referring to different words. That is, such a reordering will not affect the
    legal responses from MM, nor will it affect the legality of the commands
    from the user.

(2) f_MM and f_MMUSER are similarly invariant under reordering of result packets
    referring to different words.

(3) f_MM and f_MMUSER are invariant under reordering of LOAD(*) packets for the
    same word between any pair of DONE packets for that word, assuming the
    reference counts are suitably adjusted.

(4) The behavioral properties of MM and its user are completely independent
    [...]

Property (4) makes it possible to connect MM systems and their users through
distributors and arbitrators, and still have an MM system. The following connections are
possible:

Multiple memory connection

[Figure: multiple memory connection, with ports CMDI and RESO]

If each of the small boxes realizes f_MM (contingent on its user realizing
f_MMUSER), the large dashed box realizes f_MM for a larger address space. If the user of the
large dashed box realizes f_MMUSER, each small box's user realizes f_MMUSER.

For this to work the distributor and arbitrator must handle address fields
properly. If there are 2^N small MM units, the address field of the interconnection is N bits
longer than that of the units. The distributor picks out N bits of all incoming address fields
and uses them as the output port numbers. (For interleaving purposes, it might be most
effective to pick out the least significant bits.) Those bits do not appear in the address fields
of the packets that are sent to the MM units. The arbitrator inserts the input port number of
each incoming packet into the address field in the same positions as the bits that were
removed by the distributor.
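
The address handling described above amounts to removing and reinserting a bit field. A minimal Python sketch follows, assuming the least significant bits are used for the port number (the interleaving choice the text suggests); the function names and the value of `N_BITS` are illustrative.

```python
def distribute(address, n_bits):
    """Split an interconnection address into (output port, unit address).

    Assumes the port number is taken from the n_bits least significant bits.
    """
    port = address & ((1 << n_bits) - 1)
    unit_address = address >> n_bits
    return port, unit_address

def arbitrate(port, unit_address, n_bits):
    """Reinsert the input port number into the positions the distributor removed."""
    return (unit_address << n_bits) | port

N_BITS = 5                       # 2**5 = 32 MM units (illustrative value)
port, unit_address = distribute(0b1011_00110, N_BITS)
restored = arbitrate(port, unit_address, N_BITS)
```

Taking the low-order bits means consecutive addresses fall on different units, which is the usual motivation for interleaving.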



This connection is one of the methods by which the transaction rate can be
increased. Random access memory devices have the property that every read or write
transaction causes the device to become busy for some period of time, during which it cannot
handle any other transactions. For example, a MOS RAM might be busy for 500 nanoseconds
during every transaction, and therefore be able to handle 2 million transactions per second.
Putting a FIFO buffer on it will increase its latency (as the term was defined previously), but
its transaction rate stays the same. The only way to increase the data rate is to use many
memory units. If a distributor can handle 64 million packets per second on its input port, and
an arbitrator can handle 64 million packets per second on its output port, it might be
reasonable to use 32 MOS RAM's, each in a separate MM unit. These are connected to a 32
port distributor and a 32 port arbitrator. The average rate at which packets come out of
each port of the distributor is 2 million per second, which is the rate at which individual units
can handle them. Therefore, if the commands are uniformly distributed over the address space,
this interconnection will handle 64 million transactions per second. The retrieval delay for
each item will still be 500 nanoseconds, but that is an unavoidable consequence of the [...].
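
The arithmetic in this example can be written out as a short Python calculation. It is a sketch that assumes perfectly uniform traffic and treats the distributor and arbitrator as a single rate limit; the function and parameter names are hypothetical.

```python
def transaction_rate(units, busy_ns, network_rate):
    """Aggregate transactions per second for `units` interleaved memory units.

    Assumes commands are spread uniformly over the units, so the total is
    limited either by the units together or by the distributor/arbitrator.
    """
    unit_rate = 1_000_000_000 // busy_ns      # transactions per second per unit
    return min(units * unit_rate, network_rate)

single = transaction_rate(1, 500, 64_000_000)
interleaved = transaction_rate(32, 500, 64_000_000)
```

Adding units beyond 32 would not help in this example, because the distributor and arbitrator are already saturated at 64 million packets per second.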



For this interconnection to work properly, the latency of the individual MM
units, or the output latency of the distributor, must be at least one, and preferably more. If
the MM units and the distributor all have latency zero, the distributor will be unable to
[...] completely processed by the MM unit. This would defeat the purpose of using multiple
units.

In practice, the latency might be somewhat more than one, in order to maintain a transaction
rate near the maximum in the presence of statistical fluctuations in the frequency of commands
for each unit. This can be accomplished by putting a small FIFO buffer between the distributor
and each MM unit.

Multiple user connection

This is just like the multiple memory connection, but with the roles of MM and the user
exchanged. If the solid box realizes f_MM, each of the interfaces at the top of the diagram
realizes f_MM for a smaller address space. If each of the users of this interconnection realizes
f_MMUSER, the combination of all of them along with the arbitrator and distributor realizes
f_MMUSER on the large address space.

As in the previous case, the arbitrator must map the input port number into a
larger address field, and the distributor must remove the corresponding part of the address
field and use it as the output port number. Each of the interfaces at the top of the diagram
realizes an equivalent address space, and each uses a different subset of the memory space
contained in the actual MM unit.



This connection would be used if there were several users, each presenting 
commands at such a slow rate that one memory module could handle all of them. Such a 
situation could arise if several cache modules are used which have a sufficiently high "hit" 
rate that the rate of memory requests arising from cache misses is low.

3.2 VERTICAL COMPOSITION AND THE CACHE MODULE

In this section we examine the cache module "CM", which connects to an MM
system and [...] an MM system [...]




The purpose of a cache is to retain the data of a small subset of the main
memory's address space, and return requests for data in that subset directly without reading
it from main memory. Since the cache has much less data than the main memory, it can be
built out of faster circuits and devices without being prohibitively expensive. Hence any
request for a datum that is in the cache (a "cache hit") is answered very quickly. If the cache
is sufficiently well designed that it has a high hit rate, the overall performance of the memory
will be nearly as good as that of the cache itself.

A cache must be designed to maximize the hit rate by holding those memory
items that are likely to be addressed. This is usually done by assuming that the addresses
being used vary slowly with time, and so, when an item is referred to once, it is likely to be
referred to again soon, and should be placed in the cache. Therefore, when an item is
addressed which is not in the cache (a "cache miss"), the datum is fetched from main memory,
placed in the cache, and also returned to the user. Subsequent requests for that datum will
be cache hits.



The size of the "items" that the cache contains affects its performance. A cache
for the main memory of a conventional computer may use rather large items consisting of, for
example, 8 consecutive words. This is effective because references to memory, especially
instruction fetches, tend to be localized in space. When a cache miss occurs on any word, a
block of 8 consecutive words is read from main memory and loaded into the cache. Since
references in the immediate future are likely to be in this block, the hit rate is increased.

The structure memory for a data flow computer has no such locality of
reference. Therefore the unit of cache [...]

Placing an item in the cache usually requires removing some other item. The
most popular strategy, and the one that will be used here, is the "least recently used" (LRU)
strategy. Each reference to a cache item is noted in some sort of reference table. When
space must be made in the cache for a new datum, the item that has been used least recently,
that is, has gone the longest time without a reference, is displaced.



When a write command is issued, the item in the cache is updated
appropriately. In some cache organizations, the item in main memory is always updated also.
This technique, known as "write through", will not be used here. Instead, the item in the
cache will simply be marked as having been modified. When an item that has been modified
must be displaced from the cache, it is first written into main memory. This method has a
lower volume of commands going from the cache to main memory than the "write through"
method.
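
The saving can be illustrated with a minimal sketch. The class and method names
(WriteBackCache, write, evict) are hypothetical and do not come from the thesis; main
memory is modeled as a plain dictionary, and a counter of memory writes stands in for the
volume of commands sent to main memory.

```python
class WriteBackCache:
    """Toy cache in which writes only mark the item as modified."""

    def __init__(self, main_memory):
        self.main_memory = main_memory   # dict: address -> data (stand-in for main memory)
        self.items = {}                  # address -> [data, modified_flag]
        self.memory_writes = 0           # commands sent from the cache to main memory

    def write(self, addr, data):
        # Update only the cache and set the "modify" flag.
        self.items[addr] = [data, True]

    def evict(self, addr):
        # A modified item is written into main memory before it is displaced.
        data, modified = self.items.pop(addr)
        if modified:
            self.main_memory[addr] = data
            self.memory_writes += 1


memory = {}
cache = WriteBackCache(memory)
for value in ("a", "b", "c"):
    cache.write(0o4472, value)           # three writes to the same word
cache.evict(0o4472)
```

In this sketch three writes to one word cost a single command to main memory at
displacement time, where a "write through" cache would have issued three.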

It is crucial that the cache be able to determine very quickly whether or not it
contains a given word. Since its memory space is much smaller than the full address space, it
must store the full address with each item. When a command is received, the cache must be
searched for an item with the given address. It is important that the search be conducted
quickly.

A popular method of organizing the cache for rapid searching is the "set
associative" memory [12]. The cache is organized as an array of columns and rows. The full
address space is similarly organized, with the same number of columns, and a presumably
much greater number of rows. Each item in the cache is constrained to correspond to the
same column in the full address space as its own column in the cache. Therefore, to search
for a given item whose full address is known, the address is separated into row and column.
If it is in the cache, it must be in the same column as its column address in the real memory,
so only that column of the cache need be searched. Furthermore, only row addresses need
to be stored in the cache along with the items. The column addresses are implicit from the
position in the cache.

This organization works well for a surprisingly small number of rows in the
cache. For example, the main memory cache on the IBM 370/168 computer has only four
rows. (The number of rows is referred to as "cache depth".) To determine whether a given
item is in the cache, only four address comparisons need to be made. These can easily be
performed simultaneously.

The column number of a word in the full address space is typically taken from
the low bits of its address. The row number comes from the remaining bits. This allows
consecutively addressed items to reside in the cache in adjacent columns of one row.

Example: Suppose the full address space contains 4096 addresses, and
addresses consist of four octal digits. There are 8 columns, and the low digit of the address
is the column number. The cache depth is three.

column number    0    1    2    3    4    5    6    7

row address    561  560  543  504  444  425  426  425
data             A    B    C    D    E    F    G    H

row address    412  417  447  313  314  315  270  241
data             I    J    K    L    M    N    O    P

row address    242  242  242  242  246  271  365  413
data             Q    R    S    T    U    V    W    X

The cache holds the item with address 4472, with data "K". When a command is
received requesting the contents of location 4472, the address is divided into the row (447)
and the column (2). Column 2 of the cache is then searched for 447. It contains 543, 447,
and 242. 447 is compared with these three row addresses simultaneously. It matches the
second of them, so the data associated with it (K) is returned to the user.
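
The lookup in this example can be sketched in a few lines. The function names
(split_address, lookup) are hypothetical, the table contents are copied from the example,
and the label of the item under row address 270, which is unreadable in the source, is
assumed to be "O". The loop over the cache depth is a sequential stand-in for comparisons
that the hardware would make simultaneously.

```python
COLUMNS = 8  # the low octal digit of the address selects the column

# ROW_ADDRESSES[depth][column] and DATA[depth][column], as in the example table
ROW_ADDRESSES = [
    [0o561, 0o560, 0o543, 0o504, 0o444, 0o425, 0o426, 0o425],
    [0o412, 0o417, 0o447, 0o313, 0o314, 0o315, 0o270, 0o241],
    [0o242, 0o242, 0o242, 0o242, 0o246, 0o271, 0o365, 0o413],
]
DATA = ["ABCDEFGH", "IJKLMNOP", "QRSTUVWX"]


def split_address(addr):
    """Return (row, column) for a full address."""
    return addr // COLUMNS, addr % COLUMNS


def lookup(addr):
    """Return the cached data for addr, or None on a cache miss."""
    row, col = split_address(addr)
    for depth in range(len(ROW_ADDRESSES)):
        # Only the stored row address is compared; the column is implicit.
        if ROW_ADDRESSES[depth][col] == row:
            return DATA[depth][col]
    return None
```

Only one column is ever examined, so the cost of a search depends on the cache depth and
not on the number of columns.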



When a new item is to be put into the cache, its column number is known in
advance, so only that column must be examined to find the least recently
used item. For example, if an entry for 2124 must be created, column 4 is searched. If the
least recently used item is 314, it is removed. If its modify flag is on, an UPD packet is sent
to main memory, containing the address (3144) and the data (M). The row address is then
changed to 212.



The determination of which item in a column was least recently used can be
made by some simple scheme such as keeping a counter along with the data for each item.
Whenever any reference is made, that item's counter is set to zero and all others in its column
are increased by one. The least recently used item is the one with the highest count.
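
A minimal sketch of this counter scheme for a single column follows. The function names
(touch, least_recently_used) are hypothetical, and a column is represented simply as a list
of counts, one per item.

```python
def touch(counters, used):
    """Record a reference to the item at position `used` within one column."""
    for i in range(len(counters)):
        # The referenced item is reset to zero; all others age by one.
        counters[i] = 0 if i == used else counters[i] + 1


def least_recently_used(counters):
    """Position of the item with the highest count."""
    return max(range(len(counters)), key=lambda i: counters[i])


column = [0, 0, 0]            # a column of depth three
for ref in (0, 1, 2, 0):      # items referenced in the order 0, 1, 2, 0
    touch(column, ref)
```

After this sequence the item at position 1 has gone longest without a reference and
carries the highest count.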



Because each operation in the cache involves an examination of an entire column,
the cache memory itself should be organized so that each column is a "word", that is, the
entire column is read or written at once.

3.2.1 DESIGN OF CM

The functional specification of CM is very simple. It must realize $f_{MM}$ through
its "top" ports and realize $f_{MMUSER}$ through its "bottom" ports.

If (CMDI, MEMI) are the input ports, and (RESO, MEMO) are the output ports, then
$(RESO, MEMO) \in f_{CM}(CMDI, MEMI)$ if

(1) $RESO \in f_{MM}(CMDI)$

(2) $MEMO \in f_{MMUSER}(MEMI)$



An implementation of a system realizing $f_{CM}$ will now be given. Each word of
the full address space is in one of eight states denoted N, P, P*, Q, Q*, R, R*, and T.

N - The word is not in the cache at all. (Since the cache is much smaller than
the full address space, most words are in this state at any instant.) There
are no pending commands from the user to the system. There are no
pending commands from the cache to the main memory.

P - Space has been reserved in the cache for the word, and at least one FET(±)
has been sent to main memory, but no LOAD(±) has come back. One or more
FET(±)/LOAD(±) transactions are pending to the cache. Exactly the same
transactions are pending to the main memory.

P* - Same as P, but a CLR packet has been received from the user. One or
more FET(±)/LOAD(±) transactions, plus a CLR, are pending to the cache.
The same transactions without the CLR are pending to the main memory.

Q - The first LOAD(±) has come back from main memory. A CLR packet will be
sent as soon as main memory is able to accept it. Zero or more
FET(±)/LOAD(±) transactions are pending to the cache. Exactly the same
transactions are pending to the main memory.

Q* - Same as Q, but a CLR packet has been received from the user. Zero or
more FET(±)/LOAD(±) transactions, plus a CLR, are pending to the cache.
The same transactions without the CLR are pending to the main memory.



R - The word is in the cache, but some FET(±)/LOAD(±) transactions may still be
in progress in main memory. A CLR packet has been sent to remove them.
No CLR packet has been received from the user. Zero or more
FET(±)/LOAD(±) transactions are pending to the cache. The same
transactions, plus a CLR, are pending to the main memory.

R* - Same as R, but a CLR packet has been received from the user. Zero or
more FET(±)/LOAD(±) transactions, plus a CLR, are pending to the cache.

T - The word is stored in the cache. There are no pending commands from the
cache to the main memory.



The normal states for a word are N or T, depending on whether the word is in
the cache or not. In state T, all commands are handled immediately by the cache without
any communication with main memory. In state N, a command from the user causes the
word to undergo transitions that eventually result in its being in state T. If the command is a
FET(±), the word must be read from main memory, and the state goes through some of the
intermediate states. If the command is UPD, the item is created in the cache in state T. In
either case, some other word may have to be displaced, going from state T to state N; if the
"modify" flag for that word is on, it is first written to main memory.



The specifications of MM and its user require that the user accept all result
packets from MM, but MM is only required to accept commands when the results of previous
commands have been accepted by the user (although an efficient implementation of MM might
allow many commands to be in progress at once.) This means that, to avoid a deadlock, CM
must accept packets from main memory, at MEMI, even when main memory refuses to accept
any further commands through MEMO. CM sometimes must wait for memory to accept a
command. While it is waiting, it may refuse to accept further commands at CMDI, but it must
always be willing to accept packets at MEMI. CM may assume that any packet sent through
RESO will be accepted.

The reason why CM allocates a cache cell for an item and puts it into state P as
soon as the first FET(±) command comes from the user is to avoid a deadlock, that is, a
situation from which the system cannot proceed. If it simply sent the packet out through
MEMO and did not allocate the cache cell until the first LOAD(±) packet came back, it would
use its own space more efficiently, but would be in danger of deadlock. (P cells are useless,
since they do not contain data.) This will be explained in section 6.0.

In the following description of the cache algorithm, the manipulation of the
counters to determine the least recently used item is not shown.



STATE N 



FET(±)(addr, tag) at CMDI - Create space in the appropriate cache column.
Either use an empty space (this situation can only arise when the system is
first started) or remove the least recently used item in state T. If no item
is in state T, wait until one enters state T, not accepting any packets on
CMDI while waiting. (Items in other states will progress to state T.) When
the item to be removed is found, write it out if its "modify" flag is on, by
sending an UPD packet at MEMO. If main memory is not accepting packets
at MEMO, wait until it does. Then create a new item in the cache with the
given address, "modify" = 0, state = P. Leave the data and reference count
fields unspecified. Also, send a FET(±) packet, identical to the incoming one,
out through MEMO, to fetch the data.

CLR(addr) at CMDI - Send DONE(addr) at RESO.

UPD(addr, data, ref) at CMDI - Create space in the cache as for FET(±), perhaps
sending an UPD packet to memory. Then create a new item in the cache
with the given address, "modify" = 1, data and reference count from the
command, and state = T.

LOAD(±) or DONE at MEMI - can't occur because no transactions are pending
in main memory.



STATE P

FET(±)(addr, tag) at CMDI - Send the same packet out at MEMO.

CLR(addr) at CMDI - Change to state P*.

UPD(addr, data, ref) at CMDI - can't happen, since transactions are pending in
the cache.

LOAD(±)(addr, data, ref, tag) at MEMI - Deposit the data and reference count
into the cache word, and send the same packet out at RESO. If the main
memory is accepting commands, send a CLR(addr) at MEMO and change this
cache item to state R. If not, change to state Q.

DONE at MEMI - can't happen, since no CLR has been given to main memory.

STATE P*

FET(±), UPD, or CLR at CMDI - can't happen, since user has a CLR/DONE
transaction pending.

LOAD(±)(addr, data, ref, tag) at MEMI - Deposit the data and reference count
into the cache word, and send the same packet out at RESO. If the main
memory is accepting commands, send a CLR(addr) at MEMO and change this
cache item to state R*. If not, change to state Q*.

DONE at MEMI - can't happen, since no CLR has been given to main memory.



STATE Q 



Note: CM does not accept any command at CMDI whenever any item is in state
Q. Q is simply a temporary state that is waiting to send a CLR(addr) out through
MEMO and go into state R.

FET(±), UPD, or CLR at CMDI - can't happen, since cache is not accepting
commands.

LOAD(±) at MEMI - same as state R.

DONE at MEMI - can't happen, since CLR has not been sent to main memory.

Main memory becomes able to accept a command - Send CLR(addr) through
MEMO, change to state R.



STATE Q*

Note: CM does not accept any command at CMDI whenever any item is in state
Q*. Q* is simply a temporary state that is waiting to send a CLR(addr) out
through MEMO and go into state R*.

FET(±), UPD, or CLR at CMDI - can't happen, since cache is not accepting
commands.

LOAD(±) at MEMI - same as state R.

DONE at MEMI - can't happen, since CLR has not been sent to main memory.

Main memory becomes able to accept a command - Send CLR(addr) through
MEMO, change to state R*.



STATE R 



FET(±)(addr, tag) at CMDI - Update the reference count in the cache, and set
the "modify" bit if the packet was FET+ or FET-. Send LOAD(±)(addr, data,
newref, tag) through RESO, where data and newref are current contents of
the cache. Note: at the instant this happens, there may still be
FET(±)/LOAD(±) transactions pending in main memory. If so, those FET(±)
packets were earlier than this one, but the corresponding LOAD(±) packets
won't be returned until later. This is the circumstance which causes the
general system MM to occasionally return LOAD(±) packets in an order
different from that of the FET(±) packets.

UPD(addr, data, ref) at CMDI - Update the cache, set the "modify" bit. Note: if
an UPD packet is received while in state R, we know from the rules for
MMUSER that no FET(±)/LOAD(±) transactions are pending in main memory.

CLR(addr) at CMDI - Change to state R*.

LOAD(±)(addr, data, ref, tag) at MEMI - Ignore the "ref" field in the packet.
Increment or decrement the reference count in the cache if the packet is
LOAD+ or LOAD-. Do not set the "modify" flag, since main memory already
knows about the reference count change. Send LOAD(±)(addr, data, newref,
tag) through RESO, where newref = the updated reference count in the
cache.

DONE(addr) at MEMI - Change to state T.

STATE R*

FET(±), UPD, or CLR at CMDI - can't happen, since user has a CLR/DONE
transaction pending.

LOAD(±) at MEMI - same as state R.

DONE(addr) at MEMI - Send DONE(addr) through RESO, change to state T.
STATE T

FET(±)(addr, tag) at CMDI - Update the reference count in the cache, and set
the "modify" bit if the packet was FET+ or FET-. Send LOAD(±)(addr, data,
newref, tag) through RESO, where data and newref are current contents of
cache.

UPD(addr, data, ref) at CMDI - Update the cache, set the "modify" bit.

CLR(addr) at CMDI - Send DONE(addr) through RESO.

LOAD(±) or DONE at MEMI - can't happen, since there are no pending
transactions in main memory.
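
The state changes described above can be collected into a small transition table. The
following is only a sketch of the state bookkeeping for a single word: the event names
(FET_CMDI, MEM_READY, EVICT and so on) are hypothetical labels, the packets that are sent
and the cache contents are not modeled, and it is assumed that a LOAD(±) arriving in state
Q or Q* leaves the state unchanged. EVICT stands for the displacement of a word in state T
to make room for another.

```python
# (state, event) -> next state. Events ending in _CMDI come from the user and
# those ending in _MEMI from main memory. A pair of states means "first if main
# memory is accepting commands, second if not". Missing pairs "can't happen".
TRANSITIONS = {
    ("N", "FET_CMDI"): "P",
    ("N", "CLR_CMDI"): "N",
    ("N", "UPD_CMDI"): "T",
    ("P", "FET_CMDI"): "P",
    ("P", "CLR_CMDI"): "P*",
    ("P", "LOAD_MEMI"): ("R", "Q"),
    ("P*", "LOAD_MEMI"): ("R*", "Q*"),
    ("Q", "LOAD_MEMI"): "Q",
    ("Q", "MEM_READY"): "R",
    ("Q*", "LOAD_MEMI"): "Q*",
    ("Q*", "MEM_READY"): "R*",
    ("R", "FET_CMDI"): "R",
    ("R", "UPD_CMDI"): "R",
    ("R", "CLR_CMDI"): "R*",
    ("R", "LOAD_MEMI"): "R",
    ("R", "DONE_MEMI"): "T",
    ("R*", "LOAD_MEMI"): "R*",
    ("R*", "DONE_MEMI"): "T",
    ("T", "FET_CMDI"): "T",
    ("T", "UPD_CMDI"): "T",
    ("T", "CLR_CMDI"): "T",
    ("T", "EVICT"): "N",
}


def step(state, event, memory_accepting=True):
    """Return the next state of one word, or raise if the event can't happen."""
    nxt = TRANSITIONS.get((state, event))
    if nxt is None:
        raise ValueError(f"{event} can't happen in state {state}")
    if isinstance(nxt, tuple):
        return nxt[0] if memory_accepting else nxt[1]
    return nxt


def run(events, state="N"):
    """Apply a sequence of events, assuming main memory always accepts commands."""
    for event in events:
        state = step(state, event)
    return state
```

For instance, a word that is fetched once passes from N through P and R to T, and takes
the detour through Q only when main memory cannot accept the CLR immediately.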

3.2.2 PROOF OF CORRECTNESS OF CM 

A proof of CM's correctness is generally similar to that of the system MEM
given in section 2.0.3. The memory state required in the specification is the contents of the
last UPD packet in the input history. One must show that, for a cell in states Q, Q*, R, R*, or T,
the data in the cache itself is the same as that in the last UPD packet at CMDI, and, if the
modify bit is off, this data is in main memory also. For states N, P, and P*, the correct data is
in main memory, that is, the last UPD at CMDI has the same data as the last UPD at MEMO.
These properties must be shown to be preserved for all state transitions, and it must be
shown that all legal FET(±) commands will get the correct data. Furthermore, the effect of
reference count modifications resulting from FET+ and FET- commands must be taken into
account.

4.0 IMPLEMENTATION OF MM USING A "ROTATING" DEVICE

"Rotating" memories such as charge coupled device (CCD) or "magnetic bubble"
shift registers, or magnetic disks, are rightly considered to be essentially unusable for the
main memory of a computer because of their excessive retrieval delay. In a data flow
computer, total transaction rate is as important a criterion as retrieval delay, and so the
disadvantages of these devices largely disappear, making them perhaps economical as a mass
store. On the other hand, further improvements in RAM technology may render these shift
registers obsolete for most applications. This section is predicated on the assumption that
CCD's or bubble memories will be economical and useful in the packet memory system.

In a rotating memory, the data is structured in a ring which "rotates" past a
"read/write head". Equivalently, one may think of it as a fixed ring and a pointer rotating
around the ring, with memory transactions permitted only on the cell currently pointed to. If
the addresses of words correspond to fixed places on the ring, it is possible to predict when
any given cell will be pointed to. Commands from the user can be stored in a memory
somewhat like a queue, sorted by position, so that the pending transaction at the head of the
queue is always (or nearly always) the one that the pointer will reach next. This will make
optimal use of the availability of data from the CCD.

There are a number of CCD architectures currently in use. In the "line
addressed random access memory" (LARAM), only a small part of the device shifts at full
speed at any one time. The rest shifts and recirculates at a much lower speed in order to
conserve power. The intent is to make the device behave somewhat like a random access
memory. To retrieve any one item, one finds the section in which that item is stored, and
directs the CCD to shift that section at high speed until the desired item is found. While this
is happening, the other sections are shifting much more slowly, so this architecture is not
efficient when many items are being sought at one time. It is therefore not suitable for the
type of packet memory system being considered here.

Two other types of CCD's are the "serpentine", which is simply a long shift
register (it "snakes" back and forth on the IC chip), and the "serial-parallel-serial", which is
simply a collection of interleaved shift registers. These two types differ only in engineering
specifications such as data rate and power consumption. They both behave like long shift
registers, and hence are suitable for the type of memory under discussion.

There are a number of implementation considerations that must be taken into
account in designing a rotating packet memory. For example, a number of shift registers, one
for each bit of a data word, may be used, so that a new data word comes into position on
each clock pulse. On the other hand, a single shift register might be used, with each word
stored serially, or any arrangement between these two extremes can be used. One might also
use an unusual correspondence between address and shift register position. All of these
considerations are irrelevant to the structure being considered, so we will assume the memory
is a ring of full words, ordered by address, with address zero following the highest address,
and the pointer scanning the ring in order of increasing address. Any other implementation is
functionally equivalent.

In the following, the memory will be referred to as the "CCD", regardless of
what type of device it actually is.



Pending transactions (that is, packets received at CMDI) are stored in the
transaction list (TL), which is presumably much smaller than the memory itself. The TL is
presumably realized with a random access memory device. In order to avoid moving data in
the TL unnecessarily, it has a ring structure just like the memory. Transactions are placed in
the TL at or near the same angular position as the position in memory of the word to which
they refer. Since the TL is a smaller ring than the memory, each address of TL corresponds
to many consecutive addresses of memory.



Let «X» be the function mapping addresses in the entire address space into
the corresponding address in the TL. This is called the hash function for reasons that will be
explained later. «X» is just the integer part of the quotient of X divided by the ratio of
memory size to TL size. In a realization in which all sizes are powers of two, «X» is just the
appropriate number of high order bits of X.
When a command is received for address X, the command packet is placed in
the TL at address «X», or the first free address thereafter if «X» is full. Assuming a
uniform distribution of addresses appearing in commands, the TL should be uniformly filled.
As the memory pointer rotates through the memory, another pointer, maintaining about the
same angular position, rotates through the TL, picking out the next transaction to perform.
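
As a small illustration of the hash function, consider the following sketch. The function
name tl_hash is hypothetical, and the memory size is assumed to be an exact multiple of
the TL size.

```python
def tl_hash(x, memory_size, tl_size):
    """Map memory address x to its transaction list address."""
    ratio = memory_size // tl_size    # memory words per TL cell
    return x // ratio                 # integer part of the quotient


def tl_hash_pow2(x, memory_bits, tl_bits):
    """Same mapping when both sizes are powers of two: keep the high order bits."""
    return x >> (memory_bits - tl_bits)
```

With a 64 word memory and an 8 cell TL, the hash address of a two digit octal address is
simply its first digit.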

The TL is organized much like the "ordered hash table" devised by Amble and
Knuth [2], with modifications to allow for its circularity and for the fact that items are being
removed from it. In an ordered hash table, each item has a hash address. It is placed in the
table at its hash address or in the contiguous block of items after the hash address. This
block is in increasing order of data value. This ordering makes it possible to determine
whether an item is in the table much more quickly than in a conventional hash table.

Although ordered hash tables are intended for entirely different applications
than the transaction list of a packet memory, the concept is well suited to this application.
The "value" of an item in the table is the word address appearing in the packet. Let a(P)
denote this address for packet P, and call it the "CCD address". The "hash address"
corresponding to CCD address X is just «X», defined earlier. (Hash functions are usually
designed to be random, but that property is not desirable here.) The hash address of packet
P is therefore «a(P)».

Because the TL is a ring instead of a linear list, a different definition of order is
needed. The concepts of "greater than" and "less than" are replaced by "clockwise from" and
"counterclockwise from". Since any item is both clockwise and counterclockwise from any
other item, the order of two items must be defined relative to a third. This is done through
the use of intervals, denoted in ordinary mathematical notation. [X, Y] is the interval from X
clockwise to Y. If X ≤ Y, it has its customary meaning. If X > Y, [X, Y] is the set of numbers
from X up to the highest address, and then from zero up to Y. "Open" and "half open"
intervals have their customary meaning, that is, [X, Y) means [X, Y] exclusive of Y, etc. [X, Y)
and [Y, X) are clearly complements of each other if X ≠ Y.
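
Membership in such a circular interval is easy to compute. The following sketch uses
hypothetical function names (in_closed, in_half_open) and assumes addresses are
non-negative integers on a ring.

```python
def in_closed(z, x, y):
    """True if z lies in [x, y], going clockwise from x to y."""
    if x <= y:
        return x <= z <= y            # customary meaning
    return z >= x or z <= y           # interval wraps past the highest address


def in_half_open(z, x, y):
    """True if z lies in [x, y), that is, [x, y] exclusive of y."""
    return z != y and in_closed(z, x, y)
```

On a ring of 8 addresses, [6, 2) contains 6, 7, 0 and 1, and [2, 6) contains the remaining
addresses, so the two half open intervals are complements.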

The ordering of hash addresses and word addresses is expressed in terms of
whether or not an address is in an interval. Z ∈ [X, Y) means that if one starts at X and
moves clockwise, one reaches Z before Y.
The general rule for maintaining order in the TL is that, if one goes clockwise from
an item's hash address to the item itself, one will not pass any empty cells, and will pass only
items whose hash addresses are counterclockwise from this one. This
is best illustrated with a diagram. Let CCD addresses be two octal digits and hash addresses
be one digit. The hash function picks out the first digit. The transaction list has 8 cells and is
drawn as a circle.

Two cells, including cell 6, are empty. Cell 2 contains a packet with address 16, whose hash
is 1 but was displaced because cell 1 is full.



It is possible for a transaction list to contain several packets referring to the same
CCD address. The packets for a given address can be in one of the following states:

One or more FET(±) packets. When the CCD pointer reaches the appropriate
address, its data will be read and sent out in a sequence of
LOAD(±) packets.

One or more FET(±) packets, followed by a CLR. When the CCD pointer reaches
the appropriate address, the LOAD(±) packets will be sent out, followed by
a DONE packet.

A single UPD packet. The data will be written into the CCD when the
appropriate address is reached.

No other states are possible. This is because it is a violation of the rules for MMUSER to send
an UPD packet when there are FET(±) or CLR packets pending. If an UPD is given when an
UPD is already pending, the new one simply replaces the old one. If a FET(±) is given when
an UPD is pending, the data is taken directly from the pending UPD packet and returned in a
LOAD(±) packet.

Intuitively, the rule for a well formed transaction list is that the lines
progressing clockwise from a cell to those items with that cell's hash address must never
cross each other or pass over an empty cell. If an item with CCD address 43 were placed
into cell 6, this rule would be violated, since the line from 4 to 43 would cross the line from 5
to 55. The insertion algorithm must instead put the 43 into cell 5 and move the 55 to cell 6.
Furthermore, all items with the same hash address must be ordered by CCD address. In the
example, 16 is clockwise from 11.

To insert an item, start at its hash address and search clockwise until an empty
cell or a cell containing an item with higher (more clockwise) CCD address is found. In the
former case, insert the new item. In the latter case, insert the new item after making space
for it by pushing the old item, and all those contiguously following it, one space clockwise. In
the example, insertion of item 10 would require pushing 11, 16, 25, 32, and 55 clockwise.
Insertion of 42 would require pushing only the 55.
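
The insertion procedure can be sketched as follows. This is a simplified illustration, not
the algorithm of the appendix: "higher" is taken to be plain integer comparison, so
wraparound past the highest address is ignored, the removal region is not handled, and
packets are represented only by their CCD addresses. The names (insert, example_tl) are
hypothetical. The example list places 11, 16, 25, 32 and 55 in cells 1 through 5 as the
text implies and assumes every other cell is empty.

```python
TL_SIZE = 8
RATIO = 8                      # CCD addresses per TL cell (two octal digits, 8 cells)


def insert(tl, addr):
    """Insert CCD address addr into the ring tl, where None marks an empty cell."""
    if None not in tl:
        raise OverflowError("transaction list is full")
    cell = addr // RATIO       # hash address: the first octal digit
    # Search clockwise for an empty cell or an item with a higher CCD address.
    while tl[cell] is not None and tl[cell] <= addr:
        cell = (cell + 1) % TL_SIZE
    # Place the new item, pushing the old item and all those contiguously
    # following it one space clockwise.
    carried = addr
    while carried is not None:
        tl[cell], carried = carried, tl[cell]
        cell = (cell + 1) % TL_SIZE
    return tl


def example_tl():
    return [None, 0o11, 0o16, 0o25, 0o32, 0o55, None, None]
```

Because the search continues past items with an equal address, a new packet lands
clockwise of older packets for the same address, so packets for one address stay adjacent
and in order of arrival.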

While incoming command packets are being placed in the TL by the above
procedure, packets are being removed and sent to the CCD memory. This is accomplished
through the use of a transaction list pointer (TLP) which rotates clockwise roughly in
synchronization with the CCD address pointer. When the CCD pointer points to CCD cell
10, the TLP points to TL address 1. Since a packet for address 11 is found there, it waits until
the CCD pointer = 11, removes the packet from the TL, and performs the indicated operation
on the contents of CCD address 11. The TLP is then immediately advanced to the next
position, 2. Since the packet there specifies address 16, it waits until the CCD pointer = 16
and then removes the packet and performs the memory operation. The TLP then moves to 3
and the process continues.
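
The effect of keeping the TL ordered can be seen in a small sketch that computes how long
the TLP waits for each packet. The function name service_order is hypothetical, the TL is
treated as static (no packets arrive during the scan), and it is an assumption of the sketch
that empty cells are simply skipped. The example list places 11, 16, 25, 32 and 55 in cells
1 through 5, with the other cells assumed empty.

```python
def service_order(tl, tlp, ccd_pointer, memory_size=64):
    """Return (address, wait) pairs in the order the packets are serviced.

    wait is the number of CCD shifts spent waiting for each address.
    """
    schedule = []
    for offset in range(len(tl)):
        addr = tl[(tlp + offset) % len(tl)]
        if addr is None:
            continue                              # assumption: skip empty cells
        wait = (addr - ccd_pointer) % memory_size
        schedule.append((addr, wait))
        ccd_pointer = addr                        # pointer is now at this address
    return schedule


tl = [None, 0o11, 0o16, 0o25, 0o32, 0o55, None, None]
schedule = service_order(tl, tlp=1, ccd_pointer=0o10)
```

In this example the waits add up to less than one rotation of the 64 word memory, which is
the point of sorting the pending transactions by position.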



The removal of items from TL makes it necessary to modify the rules for a
well-formed transaction list. If 16 is removed from the example list, the line from cell 2 to
item 25 passes through an empty cell, which would violate the condition given previously.
Therefore, the region in which packets are removed is considered to be a "removal region",
and it is permissible for the line from a cell to its item to pass over empty cells within
the removal region. The removal region is delimited at its counterclockwise end by the
"removal pointer" RP, and at its clockwise end by TLP. After removing 11 and 16, the example
looks like this:

[Figure: the example transaction list after removing 11 and 16; removal region = [RP, TLP).]


Whenever an item is removed, RP is set to the hash address of that item. In
the example, after 25 is removed, RP will be set to 2 (25's hash address), and TLP will be
advanced to 4.

The rules for a well-formed transaction list can now be given formally:

(1) $\forall j, k \in$ TL address space, if $j \neq k$ and $TL(j) \neq \text{empty} \neq TL(k)$, then
$[\ll a(TL(j)) \gg, j] \not\subset [\ll a(TL(k)) \gg, k]$

(That is, the interval from the hash address of an item to the item itself is never
contained within the corresponding interval for another item, i.e. the lines never cross.)

(2) $\forall j \in [RP, TLP)$, $TL(j) = \text{empty}$.

(That is, cells in the removal region are considered to be empty.)

(3) $\forall j, k \in$ TL address space, if $TL(j) = \text{empty} \neq TL(k)$ and $j \notin [RP, TLP)$, then
$j \notin [\ll a(TL(k)) \gg, k]$

(That is, the interval from the hash address of an item to the item itself does
not contain any empty cells not in the removal region.)

(4) $\forall j, k \in$ TL address space, if $\ll a(TL(j)) \gg = \ll a(TL(k)) \gg$ and $j \in [\ll a(TL(k)) \gg, k]$,
then $a(TL(k)) \geq a(TL(j))$

(That is, if two items have the same hash address, the more clockwise one has the higher
CCD address, i.e. all the packets having one hash address are ordered by CCD address.)

(5) $\forall j, k \in$ TL address space, if $j \in [\ll a(TL(j)) \gg, k)$ and $a(TL(j)) = a(TL(k))$,
then $\forall m \in [j, k]$, $a(TL(m)) = a(TL(j))$.

(That is, all items with one CCD address are adjacent. This is necessary to be sure that,
when a sequence of adjacent FET(±) packets and a CLR are found, it is possible to
return the LOAD(±) packets followed by a DONE, without danger that there are unseen
packets elsewhere referring to the same CCD address.)

(6) $\forall j, k \in$ TL address space, if $j \in [\ll a(TL(j)) \gg, k)$ and $a(TL(j)) = a(TL(k))$,
then $TL(j)$ was placed in the table before $TL(k)$

(That is, the items with the same CCD address are ordered by age, the youngest being the
most clockwise.) This property makes it possible to return a DONE packet as soon as
a CLR is encountered in the removal scan, since the packets are encountered in the
same order as they were originally received.



The insertion algorithm requires some care when passing through the removal
region. If the scan starts outside of the region, and then enters the region, the item is placed
in the first cell, and the region is shortened by one so that that cell is no longer part of the
region. If the scan begins in the region but not in its first cell, the scan skips over the region
and starts after its end. If the scan begins in the first cell of the region, it skips to the end if
its CCD address is greater than or equal to that of the item just past the end. Otherwise, it is
inserted in the first cell and the region is shortened.

[Figure: example transaction list with its removal region.]

To insert packet:    Action:

22-27                put at 3, set RP := 4
30-33                put at 3, set RP := 4
34-35                put at 6, push the 36 and 43
36-42                put at 7, push the 43
43-77, 00-07         put at



The algorithm for inserting an item into the TL is given in appendix III. If the
TL already contains an UPD packet for the same address, it instead performs the indicated
action, perhaps modifying the UPD packet and perhaps transmitting a packet at RESO.

The removal algorithm is somewhat simpler. The TL item pointed to by TLP is
next to be removed. The CCD pointer indicates the current item available at the CCD output.
From the standpoint of the algorithms for handling the TL, the CCD pointer must be considered
to be inexorably advancing under control of an external agency. The external agency is the
clock controlling the shifting of the CCD shift register, or, in the case of a magnetic disk
memory, it is the information being read from the disk's timing tracks.



The fact that the CCD pointer is synchronized to external events means that it
cannot be integrated fully into a system using the packet communication principle. It must be
considered external to the packet system, and so synchronizer or arbitration devices must
be used in the interface. The design of such an interface is a common problem of digital
system design, and is beyond the scope of this thesis. We will assume that the interface
between the synchronous memory device and the packet system consists of ports CCDI and
CCDO. Every time the CCD advances to a new address, an ADDR packet containing that cell's
address and data is sent to the system through port CCDI. If the system fails to
acknowledge the ADDR packets fast enough, so that the CCD is prevented from sending one, it
may either drop the packet or wait until the CCD has shifted all the way around to the same
address again. After the system receives an ADDR packet at CCDI announcing that an address
has been reached, it may transmit a WRITE packet at CCDO, giving the address and new data
to write. If this packet is not transmitted soon enough, it might be too late to write the data
into the CCD. In this case, the CCD shifts all the way around, not emitting any ADDR packets,
until the address is reached again, and then writes the data.

Wasting an entire rotation time whenever the asynchronous part of the system
can't keep up with the CCD clock may seem drastic, but it doesn't happen very often.
Whenever an asynchronous system must communicate with something such as the CCD clock,
there is the possibility that it may be late; however, it is not difficult to design the system
such that the probability of this happening is vanishingly small. If this is done, it is possible
to prescribe drastic remedies when it does occur, without significantly degrading system
performance.

The above description of the interface to the CCD may be somewhat simple-minded.
Many memory devices require that the write command, and the data to be written, be
given before the previous data from the same address is available. This means that the
protocol whereby the system issues a WRITE packet only after receiving an ADDR packet
bearing the data might not be appropriate. In the case of a CCD or other shift register, the
problem can be solved by having two "taps" on the register: one for reading, and another,
one or two cells later, for writing. In the case of a disk memory, the problem is more serious,
and may require that the disk announce each address slightly before the data becomes
available. The necessary modifications to the asynchronous part of the system will not be
treated here.



The rotating memory module then looks like this:

[Figure: rotating memory module, with command input port CMDI]




The removal algorithm waits for an ADDR packet at CCDI matching the address
contained in the packet in the transaction list pointed to by TLP. When found, it performs the
indicated transaction, perhaps sending a packet out at RESO. It then sets RP to the hash
address of the item which was just processed, which may shorten the removal region. The
item is then erased from the transaction list, and TLP is advanced to the next position. If TLP
now points to an item having the same CCD address, that item is processed also, using the
same data. All transactions giving the same address are handled in this way. Any reference
count changes are noted, and the modified reference count is written back into memory with a
WRITE packet at CCDO.



When TLP reaches a cell which does not contain a transaction for the same
address, either it is for a different address or it is empty. In the former case, the system
waits for the CCD to reach the new address. In the latter case, it sets RP = TLP, destroying
the removal region, and then advances both RP and TLP, in step with the ADDR packets that
give the CCD address, until it finds a transaction to perform.
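
The core of the removal step can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the transaction list is modelled as a dictionary keyed by address rather than as a hashed list with the TLP and RP pointers, every transaction is a reference count change, and the names run_rotation, cells and pending are hypothetical. It shows how all transactions for one address are served from a single ADDR packet, and how one WRITE packet carries the final reference count.

```python
def run_rotation(cells, pending):
    """Simulate one full rotation of a shift-register memory.

    cells   : list of (data, ref_count), indexed by address in shifting order
    pending : dict mapping address -> list of (delta, tag) transactions
    Returns (results, writes): packets sent at RESO and WRITE packets sent at CCDO.
    """
    results, writes = [], []
    for addr, (data, ref) in enumerate(cells):       # one ADDR packet per cell
        new_ref = ref
        for delta, tag in pending.pop(addr, []):     # every transaction for this address
            new_ref += delta
            results.append(("LOAD", addr, data, new_ref, tag))
        if new_ref != ref:                           # write back only if the count changed
            cells[addr] = (data, new_ref)
            writes.append(("WRITE", addr, data, new_ref))
    return results, writes
```

In this sketch two opposite changes to the same cell cancel, so no WRITE packet is needed for that address.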

The algorithm for the rotating memory is given in appendix III B.

5.0 STRUCTURE CONTROLLER DESIGN CONSIDERATIONS

In this section we will examine a few of the considerations that must go into
the design of an efficient structure controller.

5.0.1 CHECKING THAT THE CONTROLLER OBEYS F_USER

The structure controller never issues an UPD command unless the reference
count is known to be one. Since this is so, there can be no transactions pending on that cell,
so the requirements of F_USER are met. This is contingent, of course, on the rest of the
computer correctly realizing F_COMPUTER. A reference count violation by the computer
could lead to an UPD packet being sent while there are transactions pending.

5.0.2 PRECISE REFERENCE ACCOUNTING WITH IMPRECISE REFERENCE COUNTS 

In checking that F_MM satisfies the needs of the structure controller, there is a
point of possible danger that needs to be checked. Since LOAD<*> packets may be returned
from the memory in an order different from that of the FET<*> packets, it was shown in
section 3.0.2 that the reference counts returned from the memory may be unusual, perhaps
even negative. Is it possible for this to interfere with the cell management mechanism? The
answer is no, as long as the following rule is obeyed:

After increasing a reference count (with a FET+), do not pass the result to any
destination until the corresponding LOAD+ has returned.

For example, if an instruction cell indicates two destinations for its result, the
reference count of the result must be increased with a FET+ before the result is sent to the
destination cells. If one of those cells is a SELECT that issues a FET- to reduce the reference
count, the FET+ must act first. Furthermore, it is not enough to rely on the zero latency
arbitrator to be sure the FET+ gets to the memory before the FET-. The FET- must not be
sent until the LOAD+ arising from the FET+ has returned. This is accomplished by not sending
the result to the destination cells until the LOAD+ has been received.



It is easy to see that no cell will fail to be reclaimed that should be reclaimed.
At the time the last "owner" of a cell issues a FET- to discard it, there are no other
operations pending on the cell, so the LOAD- packet that is returned will have the correct
reference count, which is zero.

To see that no cell will be accidentally reclaimed that shouldn't be, consider a
cell with reference count 2, owned by instruction cells X and Y. Suppose X performs a
structure operation that discards its copy, so that a FET- is issued. We must show that if Y
does not discard its copy, the LOAD- that arises from X's operation will not have reference
count zero. The only way the reference count could possibly go to zero is if Y also causes a
FET-. Since Y does not intend to discard its copy of the cell, a FET+ must have been issued
first. (That is, the reference count should actually go up to 3, then down to 2 and then 1.)

The memory receives the following sequence at CMDI:

FET-(addr,X) ; FET+(addr,Y) ; FET-(addr,Y)

The situation to be avoided is that in which the second and third LOAD packets are reversed:

LOAD-(addr,-,1,X) ; LOAD-(addr,-,0,Y) ; LOAD+(addr,-,1,Y)

This can't happen, because the FET-(addr,Y) is not sent until the LOAD+(addr,-,-,Y) has been
returned.
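
The effect of the ordering rule can be checked with a short simulation. This is a sketch, not part of the memory specification: the function names are hypothetical, each command is reduced to a sign and an owner, and the memory is assumed to return the count as it stands after the change. The first order below respects the rule; the second is the forbidden one, in which Y's decrement overtakes Y's increment.

```python
def apply_commands(initial_ref, commands):
    """Apply FET+/FET- commands in the order the memory executes them.

    commands : list of (sign, owner) with sign "+" or "-"
    Returns the LOAD packets as (sign, ref_after, owner).
    """
    ref = initial_ref
    loads = []
    for sign, owner in commands:
        ref += 1 if sign == "+" else -1
        loads.append((sign, ref, owner))
    return loads


def wrongly_reclaimed(loads, final_ref):
    """True if some LOAD- reports zero although the cell is still referenced."""
    return final_ref > 0 and any(s == "-" and r == 0 for s, r, _ in loads)


safe = apply_commands(2, [("-", "X"), ("+", "Y"), ("-", "Y")])
bad = apply_commands(2, [("-", "X"), ("-", "Y"), ("+", "Y")])
```

With the safe order no LOAD- ever reports zero, while the forbidden order produces a LOAD- with count zero for a cell that is still in use.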

5.0.3 MEMORY LATENCY 

MM's latency was left unspecified only for the purpose of proving correctness
of MM and its user. When actually implementing a practical packet memory, it may be
necessary to build a high degree of latency into some modules in order to obtain good
performance. For example, a "rotating" implementation of MM using a charge coupled shift
register may be designed to have hundreds or thousands of commands pending at one time,
although its correctness does not depend on this.

5.0.4 THROUGHPUT AND DISTRIBUTED PROCESSING

One of the fundamental principles of data flow computers is that, if enough
parallelism exists in the program, a computer should be able to run arbitrarily fast for a given
logic speed. To do this, it must distribute the computation and be free of bottlenecks. If a data
flow computer could only have one multiply unit, that would be a bottleneck, since it would
limit the rate at which multiplies could be performed. The data flow concept must not place
any restrictions at all on the number of multipliers that a computer can have (although any
given computer of course has a fixed number). There must not even be bottlenecks in ports
through which packets must pass. If every multiply operation packet had to pass through one
input port of an allocator on its way to the multipliers, that would be unacceptable, since the
logic speed places a limit on the rate at which packets can pass through a port. For example,
if a port could handle packets 100 times faster than a multiplier could process them and all
packets had to pass through one port, it would mean that no more than 100 multipliers could
be usefully employed.
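
The arithmetic behind this limit is a simple ratio. The helper below is a hypothetical illustration, with rates expressed in packets per unit time.

```python
def max_useful_units(port_rate, unit_rate):
    """Largest number of functional units that a single shared port can keep busy."""
    return int(port_rate // unit_rate)


# A port 100 times faster than one multiplier can feed at most 100 multipliers.
limit = max_useful_units(100, 1)
```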

In the case of simple functional units such as multipliers, it is not difficult to
avoid bottlenecks. Multiple functional units may be used, and the arbitration and distribution
networks that connect them to the instruction cells may be designed to be free of bottlenecks
and thus maintain any desired throughput rate [5]. For the same reason, multiple structure
controllers are used, each with its own ports connected to the arbitration and distribution
networks of the data flow computer. Also, multiple memory units are used, because the total
memory transaction rate is greater than can pass through a single pair of CMDI/RESO ports.

It is not possible to compartmentalize the structure operation facilities as can
be done with simple functional units. Connecting each structure controller to one memory
module is not correct, because each structure controller must have access to the entire
memory address space. The structure controllers must be connected to the memories through
an interconnection network consisting of arbitrators and distributors for packets going in each
direction. Command packets from the structure controllers have part of the address field
removed and used to select the output port of the distributor, just as was done for the
multiple memory connection in section 3.1. In this way, each structure controller "sees" the
full address space, while each memory module supports only a small part of the total address
space. The command packets from the different structure controllers are merged in
arbitrators, which append the incoming port number to the tag field, so that the result packet
will be returned to the correct controller. Packets coming out of the RESO ports of the
memory modules pass through distributors that use the added bits of the tag field, and
arbitrators that use the incoming port number to reconstruct the full address.



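[Figure: interconnection network between the structure controllers and the memory modules.
D1 removes and uses part of the address to select the output port. A1 inserts the input port
into the tag (except for UPD packets). A2 inserts the input port. D2 removes and uses part of
the tag to select the output port.]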



The treatment of address fields and tag fields is symmetrical. One could think
of all pending structure operations as occupying a "tag space". Just as each memory module
supports a small part of the total address space, each structure controller supports a small
part of the total tag space. The job of the interconnection network is to make the entire
address space available to each structure controller, and to make the entire tag space
available to each memory unit.
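
The symmetry between address fields and tag fields can be sketched as a pair of routing functions. The details here are assumptions made for illustration: the module is selected by the low-order address bits, MODULE_BITS is a hypothetical constant, and the port number is carried by pairing it with the tag.

```python
MODULE_BITS = 2  # hypothetical: four memory modules


def route_command(controller_port, addr, tag):
    """Command path: the distributor strips the module-select bits from the
    address, and the arbitrator appends the incoming port number to the tag."""
    module = addr & ((1 << MODULE_BITS) - 1)   # selects the distributor output port
    local_addr = addr >> MODULE_BITS           # address as seen by the memory module
    return module, local_addr, (tag, controller_port)


def route_result(module, local_addr, tagged):
    """Result path: the distributor strips the port number from the tag, and the
    arbitrator reinserts the module number to rebuild the full address."""
    tag, controller_port = tagged
    addr = (local_addr << MODULE_BITS) | module
    return controller_port, addr, tag
```

A command routed out and its result routed back arrive at the originating controller with the original address and tag restored.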

It is not necessary for the network to place the distributors before the
arbitrators. Such a network would have a size proportional to the product of the number of
structure controllers and the number of memory units, which may be excessive. It is possible
to mix arbitrators and distributors in a network in such a way that the size is reasonable but
bottlenecks are avoided.

Because UPD packets do not have a tag field and do not give rise to result
packets at RESO, it is necessary that the arbitrators and distributors carrying packets from
the structure controllers to the memory modules (those labelled A1 and D1 in the preceding
diagram) have latency zero. This is so that, when a structure controller receives an
acknowledge for an UPD packet, it will be guaranteed that the packet has passed through the
arbitrator and is therefore ahead of any packet that may subsequently be introduced into
another input of the arbitrator. Suppose this were not done. One structure controller might
write on a cell, thereby completing the creation of a structure. When it receives an
acknowledge for that UPD command, it assumes that the structure is complete, and so it
returns it to the rest of the computer. An instruction cell in the computer, having received
this structure, may fire, causing a SELECT operation to be generated. The allocator may send
the SELECT operation packet to another structure controller, which then sends out a FET
packet with the same address. If there is buffering before the arbitrator that merges packets
from the two structure controllers, the original UPD packet might still be in such a buffer, so
the FET packet passes through the arbitrator first. If this happens, the old data will be read,
rather than the new data supplied by the UPD packet. By making sure that the distributor
and arbitrator have latency zero, the UPD packet cannot get stuck in a buffer. When the first
structure controller receives an acknowledge for the UPD packet, that packet is known to
have been accepted by the arbitrator, and hence it will precede any subsequent FET packet.

If it is not feasible for the interconnection network to use distributors and
arbitrators that have no memory, it is necessary to put tag fields in all UPD packets
passing through the network. An "adapter unit" is placed between the network and each
memory module. The adapter passes all packets through except UPD packets. When it
receives UPD(addr, data, ref, tag), it sends UPD(addr, data, ref) to the memory and UACK(tag)
back to the interconnection network. The structure controller does not return a structure to
the rest of the computer until it has received UACK packets for all UPD commands that it has
soni. Whether each DISK pwMl lfi #•***•* «4MmMm «f thsdnigri «f efficient routine 
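
The adapter unit and the controller's bookkeeping can be sketched as follows. The packet layout as tuples and the class name UpdTracker are hypothetical; only the UPD and UACK packet forms come from the text.

```python
def adapter(packet):
    """Return (to_memory, to_network) for one packet arriving from the network.

    A tagged UPD is stripped of its tag before reaching the memory, and a
    UACK carrying that tag is sent back. All other packets pass through.
    """
    if packet[0] == "UPD":
        _, addr, data, ref, tag = packet
        return ("UPD", addr, data, ref), ("UACK", tag)
    return packet, None


class UpdTracker:
    """Controller-side record of UPD commands not yet acknowledged by a UACK."""

    def __init__(self):
        self.outstanding = set()

    def send_upd(self, addr, data, ref, tag):
        self.outstanding.add(tag)
        return ("UPD", addr, data, ref, tag)

    def receive_uack(self, tag):
        self.outstanding.discard(tag)

    def may_release_structure(self):
        # The structure may be returned to the computer only when every
        # UPD sent so far has been acknowledged.
        return not self.outstanding
```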

5.0.5 THE FREE STORAGE LISTS

To maintain just one free storage list would create a bottleneck, so each
structure controller has one. Whenever a structure controller needs a word in order to
create a node, it takes its address from packets presented at input port UIDI. UID stands
for ua i q i m ja a wUfl a r i 31m atnwhme laidi ellii aaas. iiH m* Im setfrsoisi at tlM? they are 
suppitoo ei en iHsmneeie ejceaaiies teat esOA0yore4 



Each structure controller maintains a free storage list and sends out addresses through
output port UIDO. The UIDO ports and UIDI ports are connected to a network containing
distributors and arbitrators, called the UID network, which maintains a supply of free cells to
all structure controllers.



[Figure: the UID network, with input port UNI (from UIDO) and output ports UNO (to UIDI)]



Each structure controller, in addition to performing structure operations,
maintains a free storage list. Whenever an acknowledge is received on UIDO, it takes a cell
from the list and transmits it in a UID packet through UIDO. Since a reference count scheme is
used for recovering unused cells, the controller watches for words whose reference counts go
to zero. Every time it reduces a reference count by issuing a FET- command, it examines the
LOAD- packet that is returned. If it shows a reference count of zero, the word is reclaimed.
This involves placing the word in the free storage list and, since whatever pointers it
contained are destroyed, reducing their reference counts if their atom bits are off. If either
or both of the latter reference counts go to zero, those words are reclaimed by the same
process.

The procedure is recursive, and is an unpleasant type of recursion because the
completion of each operation can produce two more operations to perform. Although the
recursion always terminates, a huge amount of storage may be required to hold the list of
words that need to have their reference counts reduced. The problem at its worst can be
observed in the case of a large tree, no subtree of which is shared with anything else, whose
root node is discarded. All nodes have an initial reference count of 1, so, when each node has
its count reduced, it goes to zero, making it necessary to reduce the counts of both of that
node's offspring.



To implement this procedure by simply issuing two FET- packets whenever a
word's reference count goes to zero (that is, whenever a LOAD- is received bearing a count
of zero), would create an intractable deadlock problem because of the proliferation of packets.
Instead, the procedure that should be used is that only the right offspring of a word should
be treated at the time the word is placed on the free storage list. The pointer to the left
offspring will remain in the word while it is on the free storage list. The recursion in this
procedure is under control, since only one new operation is created for every operation that
is completed. When a word is taken from the free storage list, the reference count of its left
offspring is reduced, which may cause one or more words to be reclaimed, before the word is
used.

The memory management algorithm is as follows:

(1) Whenever a word's reference count is reduced, examine the LOAD- packet
    that is returned. If it shows a count of zero, put the word on the free
    storage list and, if the atom bit of its right half is zero, reduce the reference
    count of the word pointed to by that half. This may cause this step to be
    invoked again.

(2) Whenever an acknowledge is received from port UIDO, get a word from the
    free storage list and send the packet UID(addr, its left half) through UIDO.
    (The contents of the left half are sent simply to avoid an extra memory
    reference.)

(3) Whenever a fresh cell is needed for creation of a structure node, take the
    packet UID(addr, obj) at port UIDI and acknowledge same. Addr is the
    address of the new cell. If the atom bit of obj is off, reduce the reference
    count of the indicated word. This may cause step (1) to be invoked.
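
The three steps can be sketched in a single-controller model. This is an illustration under stated assumptions, not the controller's actual design: memory is a dictionary of words, an atom is represented by None instead of an atom bit, the packet exchange through the UID network is collapsed into direct method calls, and the names Word and FreeListController are hypothetical.

```python
from collections import deque


class Word:
    def __init__(self, left=None, right=None):
        self.left = left      # address of left offspring, or None for an atom
        self.right = right    # address of right offspring, or None for an atom
        self.ref = 1


class FreeListController:
    def __init__(self, memory):
        self.memory = memory  # dict: address -> Word
        self.free = deque()   # free storage list, oldest entry first

    def reduce(self, addr):
        """Step (1): reduce a count; on zero, free the word and follow only
        its right pointer. Each completed operation creates at most one new one."""
        while addr is not None:
            word = self.memory[addr]
            word.ref -= 1
            if word.ref != 0:
                return
            self.free.append(addr)
            addr, word.right = word.right, None   # left pointer stays in the word

    def allocate(self):
        """Steps (2) and (3): take a word from the list, then reduce the count
        of the left offspring that was still stored in it."""
        addr = self.free.popleft()
        word = self.memory[addr]
        left, word.left = word.left, None
        word.ref = 1
        if left is not None:
            self.reduce(left)
        return addr
```

Discarding the root of an unshared tree therefore frees only the chain of right offspring at once; the left subtrees are released gradually as words are taken from the list.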
5.0.6 MAINTAINING INTEGRITY OF THE REFERENCE ACCOUNTING MECHANISM

The possibility of an error in the reference accounting and cell management
mechanism is a troublesome problem, because, as explained in section 2.4.1, it is impossible
for the memory to detect a reference accounting error by its user. Furthermore, the effects
of such an error are unpredictable, and may show up in completely unrelated parts of the
computation. However, there are a few things that can be done to minimize the probability of
such an error being undetected.

First, all cells on the free storage list can be marked in some way, perhaps by a
bit reserved for this purpose. Any reference to a marked cell other than for the purpose of
removing it from the free storage list is a detectable error. Also, the free storage list can be
organized in such a way that cells are added at one end and removed from the other, thereby
maximizing the time that a cell stays on the list once it is put there. If a cell is erroneously
reclaimed while a "spurious" pointer to it exists, it will then probably still be on the free
storage list when the spurious pointer is used, so the error can be detected.

Another way of checking integrity of reference counts is to conduct an "audit"
of the entire computer. This can be done at the end of the computation, and at any point
during the computation. The host computer must disable all instruction cells and wait for all
pending operations to clear out of the structure controllers and the routing networks. All
reference counts can then be checked against the contents of the input registers of the
instruction cells.

6.0 THE DEADLOCK PROBLEM

The structure controller and cache module that were described previously were
both required to have a large capacity for state information which would be unnecessary if
one could always be sure that the device lower in the hierarchy would accept a command.

In the case of the structure controller, the general behavior upon receiving a
result packet from the memory is to perform some transformation on the data in its state
memory and then send a new command packet. Its internal state memory could be dispensed
with, and the state information placed directly into the tag fields of the packets. When a
result packet is received from the memory, a "memoryless controller's" function would then
be simply to perform a transformation on the packet itself, forming a new packet which is sent
to the memory. The reason this fails is that one can't be sure the memory won't decide to
return several result packets (perhaps all pending ones) before it accepts any more command
packets. Suppose this happened to a memoryless structure controller. It would have no
place to put the result packets if the memory unit isn't accepting any more commands, so a
deadlock would occur. The problem is that the controller has violated the rule that it must
always be prepared to accept the results of all pending operations. A structure controller
having state memory avoids this problem by always having space to absorb the results of all
pending operations.



A similar problem arises in the cache module. If a word is not in the cache and
a FET<*> packet is received, a cell is immediately allocated for it and placed in state P. A
FET<*> packet is also sent to main memory to fetch the data. Until the data returns from the
memory, the cell in the cache does not have data in it, so it serves no useful purpose. It
might seem to make more sense to allocate the cache cell only when the first LOAD<*> packet
is received from the memory rather than when the first FET<*> packet is received from the
user - that is, to bypass state P altogether. The problem is that the creation of a cell in the
cache may require writing out the cell's former contents. If the cell is created in consequence
of the LOAD<*> packet coming from memory, the cache may have to send a packet to memory
in response to a packet from memory. If the memory sends such LOAD<*> packets but does
not accept any replies, the cache would have no place to put the data, so a deadlock would
occur. The cache implementation given in section 3.2 avoids this problem by reserving space
for the LOAD<*> packet in advance. If a UPD packet must be sent to memory, it is done
in response to input from the user rather than from the memory. That way, if the memory
temporarily refuses to accept the UPD, the cache can simply refuse to accept input from its
user.



In both the structure controller and the cache, the cost incurred as a result of
this problem is an amount of memory equal to all the packets that can be simultaneously
pending in all lower levels. In the controller, this is the state information for all concurrently
executing structure operations. In the cache, a cell might be in state P for every
FET<*>/LOAD<*> cycle that is pending at that instant. Since a cell in state P is useless, the
cache must be that much larger than it otherwise would be, for a given level of performance.

In the case of the structure controller, the memory space is needed somewhere
in any case. If a great number of memory transactions can be pending simultaneously, a
"rotating" memory, such as was described in section 4.0, is presumably being used. If a
memoryless structure controller is used, the state information for pending operations is stored
in the tag fields instead of the controller. But the tags of pending memory operations must be
stored in the transaction list of the rotating memory, so whatever space was saved in the
controller is used up in the transaction list.

Why, then, would a memoryless structure controller be more desirable? The
reason is that memory space inside the controller is much more expensive than in the
transaction list. The controller must be able to process information as fast as the highest
level of the memory hierarchy. If that highest level is a cache using high speed (and
expensive) devices, the controller must be equally fast. The rotating memory is at the bottom
of the hierarchy, so its transaction list can use a slower and less expensive logic family.

In order to use a memoryless structure controller or a cache which does not
use "P" cells, the memory system below the controller or the cache must obey the following
"fixed latency law":

Whenever a result packet is transmitted at RESO, the device must accept a
packet at CMDI. If that packet is an UPD, it must accept still another, until it
has taken one that is not an UPD. It must do this even if the user does not accept
anything further at RESO.

The reason UPD packets are a special case is that they do not generate any result, so the
system should be able to absorb them in unlimited numbers.



avnw nwmy ejmsiRe eevy mm »sw* m rename susses impmiiiei nation ot 
clearly does. A rotating implementation can also, since its transaction list has fixed size.
Whenever an item is taken out of the list, another can be inserted. (The implementation of the
rotating memory given in section 4.0 does not always behave this way, but it could easily be
modified to do so.)

The systems that do not obey the fixed latency law are the horizontal
composition of MM units and the cache. The former includes the interconnection network
between the structure controllers and the memory units. In the case of the horizontal
interconnection of MM units, when one unit transmits a
result packet, it will accept a new command. That result packet passes through the arbitrator
and becomes a result of the interconnection, so the interconnection must accept another
command. If the command is addressed to a different MM unit than the one that transmitted
the result, that unit might not be able to accept it. What is needed is a way for the units to
share the burden of pending commands with each other.



In the case of the cache, maintaining a constant number of pending transactions
in the cache and memory combined requires maintaining a constant number of pending
transactions in the memory alone. For every result packet transmitted by main memory,
another command must go from the cache to main memory. However, such commands only
occur when there are cache misses. If the cache runs into unusually good luck and gets a
continuous string of cache hits, it would not send commands to memory. In order to maintain
constant latency, it would have to refuse any result packets from memory. This could result
in some transactions remaining pending indefinitely. While this probably won't cause a data



7.0 SUGGESTIONS FOR FURTHER RESEARCH

One of the principal problems remaining in the area of the design of systems
using the packet communication principle is the development of a practical and systematic
procedure for constructing systems that can be proven to meet given functional specifications.
An important tool for this task is the development of a rigorous and concise Architecture
Description Language (ADL). With the help of the ADL, the task can be divided into two parts:

(1) Development of a proof methodology so that systems expressed in the ADL
    can be proven to meet functional specifications.

(2) Development of a system construction methodology so that systems
    expressed in the ADL can be constructed with confidence that the physical
    device will realize the ADL expression.

For this purpose, the ADL must be simple enough to correspond neatly to the
hardware devices involved, but powerful enough to make proofs involving history arrays
tractable.

Another remaining problem is, of course, to develop functional specifications for
all parts of the data flow computer system, including the structure controller, and give proofs
of their correctness. The functional specification of the computer itself (that is, the structure
controller's user) is needed, among other things, to show that no reference count violations
will occur.

An efficient structure controller needs to be designed, with special attention to 
the needs of programs that are likely to arise. 

The deadlock problem needs to be examined carefully, to see if it is worthwhile
to build a memoryless structure controller.

APPENDIX I

Proof that the concatenation of two FIFO buffers is a FIFO buffer, and lengths are additive.

This proof is given not because the statement is of fundamental interest, but as an example of
the method of proving theorems about the behavior of systems, showing acknowledgments in
detail.

Let a FIFO of size M have input port X and output port Z.

Let another FIFO of size N have input port Z and output port Y,

and let the ports Z and the acknowledge ports Z^A be linked.



[Figure: the two FIFO buffers connected in series, from input port X to output port Y]




From the definition of the first FIFO,

(1) |Z| = min { |X| , |Z^A| + 1 }
(2) Z_j = X_j
(3) |X^A| = min { |X| , |Z^A| + M }

From the definition of the second FIFO,

(4) |Y| = min { |Z| , |Y^A| + 1 }
(5) Y_j = Z_j
(6) |Z^A| = min { |Z| , |Y^A| + N }

Case I: Suppose |X| ≤ |Y^A| + N

By the strong form of the Standard Acknowledge Restriction,
either |Z| = |Z^A| or |Z| = |Z^A| + 1.

If |Z| = |Z^A| + 1, then

   |Z| ≤ |X|                                  (from 1)
∴ |Z^A| < |X|
   |Z^A| = |Y^A| + N                          (from 6, since |Z^A| ≠ |Z|)
∴ |Y^A| + N < |X|, which is a contradiction, so we must have |Z| = |Z^A|

   |Z| = |X|                                  (from 1, since |Z| ≠ |Z^A| + 1)
∴ |Y| = min { |X| , |Y^A| + 1 }               (from 4)
   |X^A| = min { |X| , |X| + M }              (from 3)
∴ |X^A| = |X|                                 (since M ≥ 0)
∴ |X^A| = min { |X| , |Y^A| + M + N }

Case II: Suppose |X| > |Y^A| + N

If |Z| = |Z^A|, then

   |Z| = |X|                                  (from 1, since |Z| ≠ |Z^A| + 1)
   |Z^A| ≤ |Y^A| + N                          (from 6)
∴ |X| ≤ |Y^A| + N, which is a contradiction, so we must have |Z| = |Z^A| + 1

   |Z^A| = |Y^A| + N                          (from 6, since |Z^A| ≠ |Z|)
∴ |Z| = |Y^A| + N + 1
∴ |Z| ≥ |Y^A| + 1                             (since N ≥ 0)
∴ |Y| = |Y^A| + 1                             (from 4)
   |Z| ≤ |X|                                  (from 1)
∴ |Y^A| + 1 ≤ |X|
∴ |Y| = min { |X| , |Y^A| + 1 }
   |X^A| = min { |X| , |Y^A| + M + N }        (from 3 and 6)

In either case,

|Y| = min { |X| , |Y^A| + 1 }

Y_j = X_j (from 2 and 5)

|X^A| = min { |X| , |Y^A| + M + N }

which are the conditions for the interconnection being a FIFO of length M + N.
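
The counting part of this result can be checked numerically for small cases. The sketch below is a sanity check rather than a proof, and its function names are hypothetical. Writing x for |X| and ya for |Y^A|, it searches for every pair (|Z|, |Z^A|) that satisfies equations (1) and (6), then confirms that equations (4) and (3) give the same values as the two conclusions.

```python
def composite_matches(M, N, x, ya):
    """Solve (1) and (6) for z = |Z| and za = |Z^A| by exhaustive search,
    then compare (4) and (3) with the conclusions for a FIFO of length M + N."""
    solutions = [(z, za)
                 for z in range(x + 1)        # (1) implies |Z| <= |X|
                 for za in range(z + 1)       # (6) implies |Z^A| <= |Z|
                 if z == min(x, za + 1) and za == min(z, ya + N)]
    return bool(solutions) and all(
        min(z, ya + 1) == min(x, ya + 1)          # |Y| from (4) matches the conclusion
        and min(x, za + M) == min(x, ya + M + N)  # |X^A| from (3) matches the conclusion
        for z, za in solutions)


def check_all(limit=6):
    """Check every combination of small buffer sizes and packet counts."""
    return all(composite_matches(M, N, x, ya)
               for M in range(1, limit)
               for N in range(1, limit)
               for x in range(3 * limit)
               for ya in range(3 * limit))
```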
APPENDIX II



Algorithm for the cache.

Actual lookup in the cache is not shown. Instead, the special functions
cache-data(addr), cache-ref(addr), cache-state(addr), and cache-mod(addr) are used. These
are treated as though they were arrays, and are assumed to be defined whenever the given
address exists in the cache. In-cache(addr) returns true if the given address exists in the
cache.

Can-create(addr), where addr does not exist in the cache, tells whether it can
be created, that is, whether some cell in its column is unused or is in state T.

If can-create(addr) is true, creation-cell-is-empty(addr) tells whether the
former case holds, and, if so, cache-create(addr) performs the insertion into an unused cell.
Otherwise, cell-to-displace(addr) returns the address of a cell in state T, selecting the least
recently used item. Cache-rename(old, new) performs the replacement.

• processes start at 0, A 
input ports CMDI, MEMt 
output ports RESO. MEMO 
var cmd, item, addr, data, ref, oW-sddr, p 
verminjt false | toMs whether to wait for input from MEM 
var momof lag Injt |£jft | true whan lest packet ami at MEMO has boon acknowledged 
var memowait injt false | true whan need to send something en MEMO 

var wait-pkt | the thing to send 

var create-f lag injtt false | true when need to create a new cache ceH 

var create-pkt | command that tad to creation 
var new-addr | address field of creste-pkt 
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Q: 

wait for acknowledge on port MEMty 

take the acknowledge* 

memoflag :- truej 

gotoQ 



A:
  until memoflag or packet is available on port MEMI do;
  m := false;    | becomes true if should take packet at MEMI
  if memoflag then    | is memory ready for command?
    if some-cell-is-in-state-Q-or-Q' then    | see if need to send a CLR
      addr := address-of-a-cell-in-state-Q-or-Q';
      memoflag := false;
      send CLR(addr) on port MEMO;
      if cache-state(addr) = "Q" then    | change Q to R, Q' to R'
        cache-state(addr) := "R"
      else
        cache-state(addr) := "R'"
    else if memowait then    | see if need to send FET± after creating a cell
      memowait := false;
      memoflag := false;
      send wait-pkt on port MEMO
    else if create-flag then    | see if trying to create a cell
      if can-create(new-addr) then    | is some cell in its column empty or in state T?
        create-flag := false;    | yes, will create the cell
        if creation-cell-is-empty(new-addr) then
          cache-create(new-addr)    | old cell empty, just put in new address
        else
          old-addr := cell-to-displace(new-addr);    | find cell to displace
          if cache-mod(old-addr) then
            memoflag := false;    | write out previous contents if necessary
            send UPD(old-addr, cache-data(old-addr), cache-ref(old-addr)) on port MEMO;

etaMhO^Bka^aaMAe^^a^Mia^BluJaJiVdW " **— -- 1 -j- -*JL- *. . I jh^^t^A^ ftl^^ *-"■*---— ■ n M 

          cache-rename(old-addr, new-addr);    | create the new cell
        | the new cache cell now exists
        if create-pkt = UPD(--,--,--) then    | what command caused the creation?
          let create-pkt = UPD(--, data, ref);    | UPD, fill in new cell appropriately
          cache-mod(new-addr) := true;
          cache-data(new-addr) := data;
          cache-ref(new-addr) := ref;
          cache-state(new-addr) := "T"
        else    | command was FET±
          cache-mod(new-addr) := false;
          cache-state(new-addr) := "P";
          wait-pkt := create-pkt;    | queue the command for transmission to memory
          memowait := true
      else
        m := true    | can't create new cache cell, must wait
    else
      wait for packet on MEMI or CMDI, set p := that port;
      if p = "CMDI" then
        | +++++ process packet from CMDI +++++
        cmd := RCVPKT(CMDI);
        if cmd = FET±(--,--) then
          let cmd = FET±(addr, tag);
          if in-cache(addr) then
            if cache-state(addr) = "P" then
              memoflag := false;    | state P, just send it onward
              send cmd on port MEMO
            else    | state is R or T
              if cmd = FET+(--,--) then    | need to update reference count?
                cache-ref(addr) := cache-ref(addr) + 1;
                cache-mod(addr) := true;
                XMTPKT(RESO) := LOAD+(addr, cache-data(addr), cache-ref(addr), tag)
              else if cmd = FET-(--,--) then
                cache-ref(addr) := cache-ref(addr) - 1;
                cache-mod(addr) := true;
                XMTPKT(RESO) := LOAD-(addr, cache-data(addr), cache-ref(addr), tag)
              else
                XMTPKT(RESO) := LOAD(addr, cache-data(addr), cache-ref(addr), tag)
          else    | state N
            new-addr := addr;    | set flags so cell will be created
            create-pkt := cmd;
            create-flag := true
        else if cmd = UPD(--,--,--) then
          let cmd = UPD(addr, data, ref);
          if in-cache(addr) then    | must be state R or T
            cache-data(addr) := data;
            cache-ref(addr) := ref;
            cache-mod(addr) := true
          else    | state N
            new-addr := addr;    | set flags so cell will be created
            create-pkt := cmd;
            create-flag := true
        else    | must be CLR
          let cmd = CLR(addr);
          if in-cache(addr) then    | state P, R or T
            if cache-state(addr) = "P" then
              cache-state(addr) := "P'"
            else if cache-state(addr) = "R" then

            else    | state T

          else    | state N

        | +++++ end of CMDI processing +++++
      else
        m := true    | packet was from MEMI
  else
    m := true



  if m then
    | +++++ process packet from MEMI +++++
    item := RCVPKT(MEMI);
    if item = LOAD±(--,--,--,--) then
      let item = LOAD±(addr, data, ref, tag);
      if cache-state(addr) = "P" then    | know it is in cache
        cache-data(addr) := data;
        cache-ref(addr) := ref;
        XMTPKT(RESO) := item;
        if memoflag then    | can send packet at MEMO?
          memoflag := false;    | yes
          send CLR(addr) on port MEMO;
          cache-state(addr) := "R"
        else
          cache-state(addr) := "Q"    | no
      else if cache-state(addr) = "P'" then
        cache-data(addr) := data;
        cache-ref(addr) := ref;
        XMTPKT(RESO) := item;
        if memoflag then    | can send packet at MEMO?
          memoflag := false;    | yes
          send CLR(addr) on port MEMO;
          cache-state(addr) := "R'"
        else
          cache-state(addr) := "Q'"    | no
      else    | must be state Q, Q', R, or R'
        if item = LOAD+(--,--,--,--) then    | update ref and send LOAD
          cache-ref(addr) := cache-ref(addr) + 1;
          cache-mod(addr) := true;
          XMTPKT(RESO) := LOAD+(addr, data, cache-ref(addr), tag)
        else if item = LOAD-(--,--,--,--) then
          cache-ref(addr) := cache-ref(addr) - 1;
          cache-mod(addr) := true;
          XMTPKT(RESO) := LOAD-(addr, data, cache-ref(addr), tag)
        else
          XMTPKT(RESO) := LOAD(addr, data, cache-ref(addr), tag)
    else    | must be DONE
      let item = DONE(addr);
      if cache-state(addr) = "R" then    | know it is in cache
        cache-state(addr) := "T"
      else    | must be state "R'"
        cache-state(addr) := "T";
        XMTPKT(RESO) := DONE(addr);

    | +++++ end of MEMI processing +++++
  goto A
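
The cell states that appear in the MEMI half of the listing can be summarized as a small transition model. The following Python sketch is an illustration only: the function names are hypothetical, it covers just the transitions that are legible in the listing, and it assumes that a cell in state R' ends in state T after DONE in the same way as a cell in state R.

```python
def after_load(state, memoflag):
    """Cell state after a LOAD arrives from memory.

    For a pending fetch (P or P') the cache must send a CLR to memory:
    at once if MEMO is free (R / R'), otherwise later (Q / Q').
    """
    if state not in ("P", "P'"):
        return state  # Q, Q', R, R': the listing leaves the state alone
    prime = "'" if state.endswith("'") else ""
    return ("R" if memoflag else "Q") + prime


def after_clr_sent(state):
    """A deferred CLR has now been sent on MEMO: Q -> R, Q' -> R'."""
    return {"Q": "R", "Q'": "R'"}[state]


def after_done(state):
    """DONE arrives from memory.

    Returns (next_state, forward_done); DONE is passed on to RESO only
    for the primed state.
    """
    if state not in ("R", "R'"):
        raise ValueError("DONE is only expected in state R or R'")
    return "T", state == "R'"


def after_processor_clr(state):
    """CLR from the processor while a fetch is pending: P -> P'."""
    if state != "P":
        raise NotImplementedError("only the P case is modeled here")
    return "P'"
```

Read this way, the primed states record that the processor has itself issued a CLR for the address, so the eventual DONE must be forwarded on RESO.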
APPENDIX III A

The insertion algorithm for the rotating memory.

flag := false;    | becomes true if TL already has UPD packet for this address

P := «a(X)»;    | scan pointer = hash address initially

if RP ≠ TLP and P = RP and    | hash addr = start of removal region?
   («a(X)» ≠ «a(TL(TLP))» or a(X) < a(TL(TLP))) then
  TL(P) := X;    | insert item at P
  RP := RP + 1 mod M;    | shorten the removal region
  pop := pop + 1    | update TL population

if RP ≠ TLP and P ∈ [RP, TLP) then    | hash address in removal region
  P := TLP;    | advance to end of removal region

| repeat until find empty cell or enter removal region
until (P = RP and RP ≠ TLP) or TL(P) = empty or flag = true do
  (
  | see if TL already has UPD with same CCD address
  if a(X) = a(TL(P)) and TL(P) = UPD(--,--,--) then
    flag := true;

  if «a(TL(P))» = «a(X)» and a(X) < a(TL(P))
     or «a(X)» ∉ [ «a(TL(P))», P ]    | is X "smaller" than the current item?
  then
    Y := TL(P);    | save item from TL
    TL(P) := X;    | insert X here
    X := Y;    | insert saved item in next cell
               | (which pushes everything past here)
  P := P + 1 mod M    | advance P to next cell
  )

| find out whether to insert X or process it directly
if not flag then    | insert it
  if P = RP and RP ≠ TLP
  then
    TL(P) := X;    | insert item at P
    RP := RP + 1 mod M    | shorten the removal region
  else
    TL(P) := X;    | insert item at P
  pop := pop + 1    | update TL population
else    | process it directly
  let TL(P) = UPD(addr, data, ref);
  if X = UPD(--,--,--) then
    TL(P) := X    | another UPD, new one replaces old
  else if X = FET(--,--) then
    let X = FET(--, tag);    | FET, get the data
    XMTPKT(RESO) := LOAD(addr, data, ref, tag)
  else if X = FET+(--,--) then
    let X = FET+(--, tag);    | FET+, get the data and update ref
    TL(P) := UPD(addr, data, ref+1);
    XMTPKT(RESO) := LOAD+(addr, data, ref+1, tag)
  else if X = FET-(--,--) then
    let X = FET-(--, tag);    | FET-, get the data and update ref
    TL(P) := UPD(addr, data, ref-1);
    XMTPKT(RESO) := LOAD-(addr, data, ref-1, tag)
  else    | must be CLR
    XMTPKT(RESO) := DONE(addr)
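
The "process it directly" branch above, in which an arriving command is answered from a UPD already waiting in TL, can be sketched in Python as follows. The tuple encodings of packets and the function name `process_directly` are assumptions made for this illustration; only the case analysis is taken from the listing.

```python
def process_directly(upd, x):
    """Answer command x from a pending UPD for the same address.

    upd is ("UPD", addr, data, ref); x is ("UPD", addr, data, ref),
    ("FET" | "FET+" | "FET-", addr, tag) or ("CLR", addr).
    Returns (new TL entry, packet for RESO or None).
    """
    _, addr, data, ref = upd
    kind = x[0]
    if kind == "UPD":
        return x, None  # another UPD: the new one replaces the old
    if kind == "FET":
        return upd, ("LOAD", addr, data, ref, x[2])
    if kind == "FET+":
        # Fetch and increment the reference count held in the pending UPD.
        return ("UPD", addr, data, ref + 1), ("LOAD+", addr, data, ref + 1, x[2])
    if kind == "FET-":
        # Fetch and decrement the reference count held in the pending UPD.
        return ("UPD", addr, data, ref - 1), ("LOAD-", addr, data, ref - 1, x[2])
    if kind == "CLR":
        return upd, ("DONE", addr)
    raise ValueError("unknown command: %r" % (kind,))
```

The point of the design is that such a command never occupies a TL cell of its own: the pending UPD already carries the current data and reference count, so the reply can be sent without waiting for the word to rotate past.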
APPENDIX III B

The rotating memory algorithm.

process starts at A
input ports CMDI, CCDI
output ports RESO, CCDO

var P, X, Z, addr, data, ref, tag, CCD-addr, pop init 0, TL-cmd,
    CCD-data, CCD-ref, CCD-newref, TLP, RP
array TL size M



A: if TL(TLP) = empty then
     RP := TLP;    | destroy the removal region
   while TL(TLP) = empty and TLP ≠ «CCD-addr» do
     (
     TLP := TLP + 1 mod M;    | advance until catch up to CCD-addr
     RP := TLP    | keep removal region destroyed
     );

   | look for input packets
   if pop ≥ M-1
   then    | TL nearly full, can't take packets at CMDI
     Z := RCVPKT(CCDI);    | wait for and accept packet at CCDI
     let Z = ADDR(CCD-addr, CCD-data, CCD-ref);
     CCD-newref := CCD-ref
   else    | can accept packet on either port
     wait for packet at CMDI or CCDI, set P := that port    | nondeterminate!
     if P = "CCDI" then
       Z := RCVPKT(CCDI);    | accept packet at CCDI
       let Z = ADDR(CCD-addr, CCD-data, CCD-ref);
       CCD-newref := CCD-ref
     else
       X := RCVPKT(CMDI);    | take packet at CMDI
       | * insert or otherwise dispose of X
       | * (from appendix III A)

   | perform all transactions matching CCD-addr
   while TL(TLP) ≠ empty and a(TL(TLP)) = CCD-addr do
     (
     TL-cmd := TL(TLP);    | remove transaction from list
     TL(TLP) := empty;
     pop := pop - 1;    | update TL population
     RP := «a(TL-cmd)»;    | shorten removal region appropriately
     TLP := TLP + 1 mod M;
     if TL-cmd = CLR(CCD-addr) then

     else if TL-cmd = FET(--,--) then
       let TL-cmd = FET(addr, tag);
       XMTPKT(RESO) := LOAD(addr, CCD-data, CCD-newref, tag)
     else if TL-cmd = FET+(--,--) then
       let TL-cmd = FET+(addr, tag);
       CCD-newref := CCD-newref + 1;
       XMTPKT(RESO) := LOAD+(addr, CCD-data, CCD-newref, tag)
     else if TL-cmd = FET-(--,--) then
       let TL-cmd = FET-(addr, tag);
       CCD-newref := CCD-newref - 1;
       XMTPKT(RESO) := LOAD-(addr, CCD-data, CCD-newref, tag)
     else    | must be UPD
       let TL-cmd = UPD(addr, data, ref);
       XMTPKT(CCDO) := WRITE(addr, data, ref)
     );

   | rewrite reference count if it has changed
   if CCD-ref ≠ CCD-newref then
     XMTPKT(CCDO) := WRITE(CCD-addr, CCD-data, CCD-newref);

   goto A
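
The servicing of one word as it rotates past (the inner loop and the final reference-count rewrite above) can be sketched in Python. This is a simplified illustration under assumptions: packets are modeled as tuples, the function name `service_word` is hypothetical, the transaction list is passed in as an ordinary list of the commands matching the current address, and the CLR branch, whose body is not legible in the listing, is left as a no-op.

```python
def service_word(ccd_addr, ccd_data, ccd_ref, commands):
    """Apply the queued commands for one address to the word now under the head.

    commands: ("FET" | "FET+" | "FET-", addr, tag), ("UPD", addr, data, ref)
    or ("CLR", addr). Returns (packets for RESO, packets for CCDO).
    """
    newref = ccd_ref
    responses, writes = [], []
    for cmd in commands:
        kind = cmd[0]
        if kind == "FET":
            responses.append(("LOAD", ccd_addr, ccd_data, newref, cmd[2]))
        elif kind == "FET+":
            newref += 1
            responses.append(("LOAD+", ccd_addr, ccd_data, newref, cmd[2]))
        elif kind == "FET-":
            newref -= 1
            responses.append(("LOAD-", ccd_addr, ccd_data, newref, cmd[2]))
        elif kind == "UPD":
            writes.append(("WRITE", cmd[1], cmd[2], cmd[3]))
        elif kind == "CLR":
            pass  # assumption: body not legible in the listing, modeled as a no-op
        else:
            raise ValueError("unknown command: %r" % (kind,))
    # Rewrite the reference count only if the fetches changed it.
    if newref != ccd_ref:
        writes.append(("WRITE", ccd_addr, ccd_data, newref))
    return responses, writes
```

Several reference-count changes to the same word are thus folded into at most one extra WRITE per rotation.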